The cell line transcriptome
The word transcriptome refers to the full set of RNA molecules that are transcribed from the genome in a population of cells, or in a specific cell, at a given time point. In contrast to the genome, which is stable across the different cell types within an organism, the transcriptome varies greatly between cell types, between developmental stages, and in response to internal or external cues. The plastic nature of the transcriptome, and its potential to serve as a proxy for cellular identity and diversity, make it appealing to study, and advances in high-throughput technologies have made it possible to analyze RNA expression in great detail.
In the Cell Atlas, the expression of 19670 protein-coding genes is analyzed by RNA sequencing of mRNA extracted from unsynchronized cells in log-phase growth. The expression levels of gene-specific transcripts are given as normalized expression (NX) values, and transcripts with NX values ≥1 are considered detected. Genes are then classified according to the specificity and distribution of their mRNA expression across a panel of 69 different human cell lines (Figure 1, Thul PJ et al. (2017)).
The Cell Atlas presents RNA expression for 98% (n=19242) of all protein-coding human genes. The data can be used for transcriptomic analyses and as a resource for selecting cell lines that express particular genes of interest.
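The detection rule described above (NX ≥1) can be illustrated with a minimal Python sketch. The gene names, cell line names and NX values below are invented placeholders, and treating a gene as detected in the panel when it reaches the cutoff in at least one cell line is an assumption made for illustration. The last line only checks the arithmetic behind the 98% figure quoted in the text.

```python
# Detection cutoff stated in the text: NX >= 1.
NX_THRESHOLD = 1.0

# Hypothetical toy data: gene -> {cell line -> NX value}. All values are made up.
toy_nx = {
    "GENE_A": {"line_1": 12.4, "line_2": 0.3, "line_3": 5.0},
    "GENE_B": {"line_1": 0.0, "line_2": 0.9, "line_3": 0.2},
    "GENE_C": {"line_1": 1.0, "line_2": 0.0, "line_3": 0.0},
}


def detected_lines(nx_by_line, threshold=NX_THRESHOLD):
    """Return the cell lines in which a gene reaches the detection cutoff."""
    return sorted(line for line, nx in nx_by_line.items() if nx >= threshold)


def detected_in_panel(nx_table, threshold=NX_THRESHOLD):
    """Return the genes detected in at least one cell line (assumed rule)."""
    return sorted(
        gene for gene, nx_by_line in nx_table.items()
        if detected_lines(nx_by_line, threshold)
    )


panel_genes = detected_in_panel(toy_nx)

# Arithmetic check on the numbers quoted in the text: 19242 of 19670 genes.
reported_fraction = 19242 / 19670
print(panel_genes, f"{reported_fraction:.1%}")
```

Because the cutoff is inclusive, a gene with an NX value of exactly 1 counts as detected.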
A diversity of cell lines
The 69 different cell lines used in the Cell Atlas have been selected to represent various cell populations in different tissue types and organs of the human body. The selection also aims at mimicking the origin and phenotype of the solid cancer types represented in the Pathology Atlas (Uhlen et al., 2017), but with an additional emphasis on cancer cell types in the hematopoietic and immune systems. In addition to cancer-derived cell lines, there are a number of cell lines that have been generated through in vitro protocols for immortalization of normal cells, some primary cell lines and one type of induced pluripotent stem cells. Details regarding the different cell lines can be found here.
Cell lines are adapted to cultivation in vitro, and many of the cell lines used in the Cell Atlas are human cancer cell lines. While this in some respects limits their resemblance to normal human cells in the context of tissues and organs, unbiased hierarchical clustering of global RNA expression (Figure 1) shows that the cell lines cluster in good agreement with similarities in origin and phenotype of the cancer cells from which they are derived. Groups of related cell lines, such as the immortalized and transformed fibroblastic cell lines (BJ derivatives), the glioma cell lines (U-138 MG and U-251 MG), the melanoma cell lines (WM-115 and SK-MEL-30), the breast cancer cell lines (SK-BR-3, MCF7 and T47d) and the endothelial cell lines (TIME and HUVEC), cluster closely together. At the highest level of separation, cell lines that grow in suspension and represent hematopoietic and lymphoid cell systems cluster together, and they separate into two major clusters depending on their myeloid or lymphoid origin/phenotype.
Figure 1. Hierarchical clustering based on RNA sequencing data for the 69 cell lines. The color of the cell line name represents its origin: Grey - Lymphoid, Light red - Muscle, Dark red - Myeloid, Bright green - Mesenchymal, Green - Pancreas, Dark green - Lung, Yellow bold - Brain, Yellow thin - Eye, Light pink - Proximal digestive tract, Pink - Female reproductive system, Dark pink - Endothelial, Beige - Skin, Orange - Kidney and urinary bladder, Blue - Gastrointestinal tract, Light blue - Male reproductive system, Light purple - Liver and gallbladder.
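As a rough illustration of the kind of clustering shown in Figure 1, the sketch below groups a handful of invented expression profiles by agglomerative clustering. The text does not state the transformation, distance measure or linkage method used for the figure, so the choices here (log2(NX + 1) transformation, one minus Pearson correlation as distance, average linkage) are assumptions, and the profile names and values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_linkage(profiles):
    """Cluster named profiles; return the merges in the order they happen."""
    # Assumed preprocessing: log2(NX + 1) to dampen very high values.
    logged = {name: [math.log2(v + 1) for v in vals]
              for name, vals in profiles.items()}
    # Assumed distance: 1 - Pearson correlation between two profiles.
    dist = {frozenset(pair): 1 - pearson(logged[pair[0]], logged[pair[1]])
            for pair in combinations(logged, 2)}

    def cluster_distance(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    clusters = [(name,) for name in logged]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges


# Hypothetical profiles over five genes; all values are made up.
toy_profiles = {
    "fibro_1": [50, 40, 1, 0, 10],
    "fibro_2": [45, 42, 2, 1, 12],
    "lymph_1": [1, 0, 60, 55, 11],
    "lymph_2": [2, 1, 58, 50, 9],
}

merges = average_linkage(toy_profiles)
for left, right in merges:
    print(left, "+", right)
```

In this toy example the two fibroblast-like profiles and the two lymphoid-like profiles each merge before the two groups are joined, which mirrors the pattern described for related cell lines in the text.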
Specificity of RNA expression
Approximately one third of all protein-coding genes (n=6186) are expressed in all cell lines, which is indicative of roles in fundamental cellular functions, or 'house-keeping' functions, for the corresponding proteins (Figure 2). In contrast, 2% (n=428) of all protein-coding genes are not detected in any of the analyzed cell lines, suggesting that the corresponding proteins are only expressed in unrepresented cell types, during specific developmental stages or under specific conditions, such as cellular stress. Relative to any of the other cell lines, 1640 of the protein-coding genes display high RNA expression in a single cell line and 1517 display high RNA expression in a small group of cell lines. A further 8849 of the protein-coding genes show elevated RNA expression in a group of cell lines compared to the average expression in all other cell lines. Table 1 shows the distribution of genes within these expression categories for each of the analyzed cell lines.
Figure 2. Pie chart showing the number of genes in the different RNA-based categories of gene expression in the panel of cell lines.
Table 1. Table showing the number of detected genes per cell line based on RNA sequencing (NX ≥1), and the number of genes in the enriched and enhanced categories.
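The tallies behind Figure 2 and Table 1 can be sketched in a few lines of Python. The data are invented, and the label "detected in some" is a hypothetical name for genes that fall between the two extremes named in the text (expressed in all cell lines, not detected in any).

```python
from collections import Counter

# Detection cutoff stated in the text: NX >= 1.
NX_THRESHOLD = 1.0

# Hypothetical toy data: gene -> {cell line -> NX value}. All values are made up.
toy_nx = {
    "GENE_A": {"line_1": 8.0, "line_2": 3.5, "line_3": 1.2},
    "GENE_B": {"line_1": 0.0, "line_2": 0.4, "line_3": 0.0},
    "GENE_C": {"line_1": 0.2, "line_2": 15.0, "line_3": 0.0},
    "GENE_D": {"line_1": 2.0, "line_2": 0.1, "line_3": 6.3},
}


def distribution(nx_by_line, threshold=NX_THRESHOLD):
    """Label a gene by how many cell lines it is detected in."""
    n_detected = sum(nx >= threshold for nx in nx_by_line.values())
    if n_detected == 0:
        return "not detected"
    if n_detected == len(nx_by_line):
        return "detected in all"
    return "detected in some"  # hypothetical label for the in-between case


# Panel-wide tally, analogous to the slices of a pie chart.
category_counts = Counter(distribution(v) for v in toy_nx.values())

# Per-cell-line tally of detected genes, analogous to one table column.
detected_per_line = Counter(
    line
    for nx_by_line in toy_nx.values()
    for line, nx in nx_by_line.items()
    if nx >= NX_THRESHOLD
)

print(dict(category_counts))
print(dict(detected_per_line))
```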
The cell line transcriptomes have been compared to the bulk transcriptomes of 37 different normal tissues and organs analyzed in the Tissue Atlas (Uhlén M et al. (2015)). There are 65 protein-coding genes that are only expressed in the panel of cell lines and not detected in any of the analyzed normal tissue types, while there are 277 protein-coding genes that are only expressed in normal human tissues and not detected in any of the analyzed cell lines. Several of the genes in the latter category encode proteins that have functions associated with differentiated cells in specialized tissues or subcompartments of tissues, which are not represented in the cell line panel. One example is ADAM30, which is expressed in spermatids of human testis.
- 65 genes found only in cell lines and not tissues
- 277 genes found only in tissues and not cell lines
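The comparison between the cell line panel and the tissue panel amounts to two set differences over the detected genes. The sketch below uses invented gene names and values, and it assumes that the same NX ≥1 cutoff applies to the tissue data, which the text does not state explicitly.

```python
def detected_genes(nx_table, threshold=1.0):
    """Return the set of genes reaching the cutoff in at least one sample."""
    return {
        gene
        for gene, samples in nx_table.items()
        if any(nx >= threshold for nx in samples.values())
    }


# Hypothetical toy data; all names and values are made up.
cell_line_nx = {
    "GENE_A": {"line_1": 4.0, "line_2": 0.0},
    "GENE_B": {"line_1": 0.0, "line_2": 0.5},
    "GENE_C": {"line_1": 2.2, "line_2": 9.1},
}
tissue_nx = {
    "GENE_A": {"tissue_1": 0.3, "tissue_2": 0.0},
    "GENE_B": {"tissue_1": 7.5, "tissue_2": 0.0},
    "GENE_C": {"tissue_1": 1.0, "tissue_2": 3.0},
}

in_cell_lines = detected_genes(cell_line_nx)
in_tissues = detected_genes(tissue_nx)

only_in_cell_lines = in_cell_lines - in_tissues  # analogous to the 65 genes
only_in_tissues = in_tissues - in_cell_lines     # analogous to the 277 genes
shared = in_cell_lines & in_tissues

print(only_in_cell_lines, only_in_tissues, shared)
```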
Cell line enriched genes
Overall, there is a large degree of agreement between the RNA expression categories in cell lines and tissues. A majority of the cell line enriched genes, defined as having at least four times higher RNA expression in a single cell line compared to any other cell line, also belong to the tissue elevated gene expression categories (tissue enriched, group enriched and tissue enhanced). For example, the secreted proteins AHSG and ALB, which are only expressed in normal liver tissue, are also highly enriched in the liver-derived cell line Hep G2, where immunofluorescent analysis shows localization to the secretory pathway. The transcription factor HOXB13, which shows expression in the prostate, colon and rectum, is also enriched in the prostate-derived cell line PC-3, where it is localized to the nucleoplasm. The adhesion glycoprotein CDH15, which is enriched in skeletal muscle tissue, is also enriched in the sarcoma cell line RH-30, with some expression in the other sarcoma cell line LHCN-M2. The enzyme TYR, which is exclusively expressed in skin, is highly enriched in the melanoma-derived skin cell line SK-MEL-30, while the epidermal growth factor receptor EGFR, which is enriched in female tissues and skin, is enriched in the other skin-derived cell line A-431. The expression patterns in normal tissues and the functions of these proteins relate to the specific traits and functions of the corresponding normal tissue type and organ.
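[Figure 3 image panels: tissue IHC images for AHSG, ALB, HOXB13, CDH15, TYR and EGFR, each paired with a cell line IF image: AHSG - Hep G2, ALB - Hep G2, HOXB13 - PC-3, CDH15 - RH-30, TYR - SK-MEL-30, EGFR - A-431.]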
Figure 3. Examples of proteins with enriched expression in a cell line and the corresponding tissue of origin. The proteins are AHSG, ALB, HOXB13, CDH15, TYR, and EGFR. The immunohistochemical (IHC) staining shows the protein expression pattern in tissue in brown. The immunofluorescent (IF) staining shows the protein subcellular expression pattern in cell lines in green. The nucleus and microtubules are shown in blue and red respectively in the IF images.
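The definition of a cell line enriched gene given in this section (at least four times higher RNA expression in a single cell line than in any other cell line) can be written as a short check. This is a minimal sketch on invented values. Additionally requiring that the top cell line reaches the NX ≥1 detection cutoff is an assumption, and the function name is hypothetical.

```python
def enriched_cell_line(nx_by_line, fold=4.0, threshold=1.0):
    """Return the enriched cell line for one gene, or None if there is none.

    The gene counts as enriched when its highest NX value is at least
    `fold` times the second-highest value. Requiring the highest value to
    reach `threshold` is an added assumption. Needs at least two cell lines.
    """
    ranked = sorted(nx_by_line.items(), key=lambda item: item[1], reverse=True)
    (top_line, top_nx), (_, second_nx) = ranked[0], ranked[1]
    if top_nx >= threshold and top_nx >= fold * second_nx:
        return top_line
    return None


# Hypothetical NX values for three genes across three cell lines.
strongly_specific = {"line_1": 120.0, "line_2": 2.0, "line_3": 0.5}
moderately_higher = {"line_1": 10.0, "line_2": 4.0, "line_3": 0.5}
below_cutoff = {"line_1": 0.5, "line_2": 0.0, "line_3": 0.0}

print(enriched_cell_line(strongly_specific))
print(enriched_cell_line(moderately_higher))
print(enriched_cell_line(below_cutoff))
```

Comparing against the second-highest value is sufficient, because a value that is four times the runner-up is also at least four times every lower value.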
Relevant links and publications
Parikh K et al., Colonic epithelial cell diversity in health and inflammatory bowel disease. Nature. (2019)
PubMed: 30814735 DOI: 10.1038/s41586-019-0992-y
Menon M et al., Single-cell transcriptomic atlas of the human retina identifies cell types associated with age-related macular degeneration. Nat Commun. (2019)
PubMed: 31653841 DOI: 10.1038/s41467-019-12780-8
Wang L et al., Single-cell reconstruction of the adult human heart during heart failure and recovery reveals the cellular landscape underlying cardiac function. Nat Cell Biol. (2020)
PubMed: 31915373 DOI: 10.1038/s41556-019-0446-7
Wang Y et al., Single-cell transcriptome analysis reveals differential nutrient absorption functions in human intestine. J Exp Med. (2020)
PubMed: 31753849 DOI: 10.1084/jem.20191130
Liao J et al., Single-cell RNA sequencing of human kidney. Sci Data. (2020)
PubMed: 31896769 DOI: 10.1038/s41597-019-0351-8
MacParland SA et al., Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat Commun. (2018)
PubMed: 30348985 DOI: 10.1038/s41467-018-06318-7
Vieira Braga FA et al., A cellular census of human lungs identifies novel cell states in health and in asthma. Nat Med. (2019)
PubMed: 31209336 DOI: 10.1038/s41591-019-0468-5
Vento-Tormo R et al., Single-cell reconstruction of the early maternal-fetal interface in humans. Nature. (2018)
PubMed: 30429548 DOI: 10.1038/s41586-018-0698-6
Qadir MMF et al., Single-cell resolution analysis of the human pancreatic ductal progenitor cell niche. Proc Natl Acad Sci U S A. (2020)
PubMed: 32354994 DOI: 10.1073/pnas.1918314117
Solé-Boldo L et al., Single-cell transcriptomes of the human skin reveal age-related loss of fibroblast priming. Commun Biol. (2020)
PubMed: 32327715 DOI: 10.1038/s42003-020-0922-4
Henry GH et al., A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra. Cell Rep. (2018)
PubMed: 30566875 DOI: 10.1016/j.celrep.2018.11.086
Chen J et al., PBMC fixation and processing for Chromium single-cell RNA sequencing. J Transl Med. (2018)
PubMed: 30016977 DOI: 10.1186/s12967-018-1578-4
Guo J et al., The adult human testis transcriptional cell atlas. Cell Res. (2018)
PubMed: 30315278 DOI: 10.1038/s41422-018-0099-2
Uhlen M et al., A proposal for validation of antibodies. Nat Methods. (2016)
PubMed: 27595404 DOI: 10.1038/nmeth.3995
Stadler C et al., Systematic validation of antibody binding and protein subcellular localization using siRNA and confocal microscopy. J Proteomics. (2012)
PubMed: 22361696 DOI: 10.1016/j.jprot.2012.01.030
Poser I et al., BAC TransgeneOmics: a high-throughput method for exploration of protein function in mammals. Nat Methods. (2008)
PubMed: 18391959 DOI: 10.1038/nmeth.1199
Skogs M et al., Antibody Validation in Bioimaging Applications Based on Endogenous Expression of Tagged Proteins. J Proteome Res. (2017)
PubMed: 27723985 DOI: 10.1021/acs.jproteome.6b00821
Takahashi H et al., 5' end-centered expression profiling using cap-analysis gene expression and next-generation sequencing. Nat Protoc. (2012)
PubMed: 22362160 DOI: 10.1038/nprot.2012.005
Lein ES et al., Genome-wide atlas of gene expression in the adult mouse brain. Nature. (2007)
PubMed: 17151600 DOI: 10.1038/nature05453
Kircher M et al., Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. (2012)
PubMed: 22021376 DOI: 10.1093/nar/gkr771
Pollard TD et al., Actin, a central player in cell shape and movement. Science. (2009)
PubMed: 19965462 DOI: 10.1126/science.1175862
Mitchison TJ et al., Actin-based cell motility and cell locomotion. Cell. (1996)
PubMed: 8608590
Pollard TD et al., Molecular Mechanism of Cytokinesis. Annu Rev Biochem. (2019)
PubMed: 30649923 DOI: 10.1146/annurev-biochem-062917-012530
dos Remedios CG et al., Actin binding proteins: regulation of cytoskeletal microfilaments. Physiol Rev. (2003)
PubMed: 12663865 DOI: 10.1152/physrev.00026.2002
Campellone KG et al., A nucleator arms race: cellular control of actin assembly. Nat Rev Mol Cell Biol. (2010)
PubMed: 20237478 DOI: 10.1038/nrm2867
Rottner K et al., Actin assembly mechanisms at a glance. J Cell Sci. (2017)
PubMed: 29032357 DOI: 10.1242/jcs.206433
Bird RP., Observation and quantification of aberrant crypts in the murine colon treated with a colon carcinogen: preliminary findings. Cancer Lett. (1987)
PubMed: 3677050 DOI: 10.1016/0304-3835(87)90157-1
HUXLEY AF et al., Structural changes in muscle during contraction; interference microscopy of living muscle fibres. Nature. (1954)
PubMed: 13165697
HUXLEY H et al., Changes in the cross-striations of muscle during contraction and stretch and their structural interpretation. Nature. (1954)
PubMed: 13165698
Svitkina T., The Actin Cytoskeleton and Actin-Based Motility. Cold Spring Harb Perspect Biol. (2018)
PubMed: 29295889 DOI: 10.1101/cshperspect.a018267
Kelpsch DJ et al., Nuclear Actin: From Discovery to Function. Anat Rec (Hoboken). (2018)
PubMed: 30312531 DOI: 10.1002/ar.23959
Malumbres M et al., Cell cycle, CDKs and cancer: a changing paradigm. Nat Rev Cancer. (2009)
PubMed: 19238148 DOI: 10.1038/nrc2602
Massagué J., G1 cell-cycle control and cancer. Nature. (2004)
PubMed: 15549091 DOI: 10.1038/nature03094
Hartwell LH et al., Cell cycle control and cancer. Science. (1994)
PubMed: 7997877 DOI: 10.1126/science.7997877
Barnum KJ et al., Cell cycle regulation by checkpoints. Methods Mol Biol. (2014)
PubMed: 24906307 DOI: 10.1007/978-1-4939-0888-2_2
Weinberg RA., The retinoblastoma protein and cell cycle control. Cell. (1995)
PubMed: 7736585 DOI: 10.1016/0092-8674(95)90385-2
Morgan DO., Principles of CDK regulation. Nature. (1995)
PubMed: 7877684 DOI: 10.1038/374131a0
Teixeira LK et al., Ubiquitin ligases and cell cycle control. Annu Rev Biochem. (2013)
PubMed: 23495935 DOI: 10.1146/annurev-biochem-060410-105307
King RW et al., How proteolysis drives the cell cycle. Science. (1996)
PubMed: 8939846 DOI: 10.1126/science.274.5293.1652
Cho RJ et al., Transcriptional regulation and function during the human cell cycle. Nat Genet. (2001)
PubMed: 11137997 DOI: 10.1038/83751
Whitfield ML et al., Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Biol Cell. (2002)
PubMed: 12058064 DOI: 10.1091/mbc.02-02-0030
Boström J et al., Comparative cell cycle transcriptomics reveals synchronization of developmental transcription factor networks in cancer cells. PLoS One. (2017)
PubMed: 29228002 DOI: 10.1371/journal.pone.0188772
Lane KR et al., Cell cycle-regulated protein abundance changes in synchronously proliferating HeLa cells include regulation of pre-mRNA splicing proteins. PLoS One. (2013)
PubMed: 23520512 DOI: 10.1371/journal.pone.0058456
Ohta S et al., The protein composition of mitotic chromosomes determined using multiclassifier combinatorial proteomics. Cell. (2010)
PubMed: 20813266 DOI: 10.1016/j.cell.2010.07.047
Ly T et al., A proteomic chronology of gene expression through the cell cycle in human myeloid leukemia cells. Elife. (2014)
PubMed: 24596151 DOI: 10.7554/eLife.01630
Pagliuca FW et al., Quantitative proteomics reveals the basis for the biochemical specificity of the cell-cycle machinery. Mol Cell. (2011)
PubMed: 21816347 DOI: 10.1016/j.molcel.2011.05.031
Ly T et al., Proteomic analysis of the response to cell cycle arrests in human myeloid leukemia cells. Elife. (2015)
PubMed: 25555159 DOI: 10.7554/eLife.04534
Dueck H et al., Variation is function: Are single cell differences functionally important?: Testing the hypothesis that single cell variation is required for aggregate function. Bioessays. (2016)
PubMed: 26625861 DOI: 10.1002/bies.201500124
Snijder B et al., Origins of regulated cell-to-cell variability. Nat Rev Mol Cell Biol. (2011)
PubMed: 21224886 DOI: 10.1038/nrm3044
Thul PJ et al., A subcellular map of the human proteome. Science. (2017)
PubMed: 28495876 DOI: 10.1126/science.aal3321
Cooper S et al., Membrane-elution analysis of content of cyclins A, B1, and E during the unperturbed mammalian cell cycle. Cell Div. (2007)
PubMed: 17892542 DOI: 10.1186/1747-1028-2-28
Davis PK et al., Biological methods for cell-cycle synchronization of mammalian cells. Biotechniques. (2001)
PubMed: 11414226 DOI: 10.2144/01306rv01
Domenighetti G et al., Effect of information campaign by the mass media on hysterectomy rates. Lancet. (1988)
PubMed: 2904581 DOI: 10.1016/s0140-6736(88)90943-9
Scialdone A et al., Computational assignment of cell-cycle stage from single-cell transcriptome data. Methods. (2015)
PubMed: 26142758 DOI: 10.1016/j.ymeth.2015.06.021
Sakaue-Sawano A et al., Visualizing spatiotemporal dynamics of multicellular cell-cycle progression. Cell. (2008)
PubMed: 18267078 DOI: 10.1016/j.cell.2007.12.033
Grant GD et al., Identification of cell cycle-regulated genes periodically expressed in U2OS cells and their regulation by FOXM1 and E2F transcription factors. Mol Biol Cell. (2013)
PubMed: 24109597 DOI: 10.1091/mbc.E13-05-0264
Semple JW et al., An essential role for Orc6 in DNA replication through maintenance of pre-replicative complexes. EMBO J. (2006)
PubMed: 17053779 DOI: 10.1038/sj.emboj.7601391
Kilfoil ML et al., Stochastic variation: from single cells to superorganisms. HFSP J. (2009)
PubMed: 20514130 DOI: 10.2976/1.3223356
Ansel J et al., Cell-to-cell stochastic variation in gene expression is a complex genetic trait. PLoS Genet. (2008)
PubMed: 18404214 DOI: 10.1371/journal.pgen.1000049
Colman-Lerner A et al., Regulated cell-to-cell variation in a cell-fate decision system. Nature. (2005)
PubMed: 16170311 DOI: 10.1038/nature03998
Liberali P et al., Single-cell and multivariate approaches in genetic perturbation screens. Nat Rev Genet. (2015)
PubMed: 25446316 DOI: 10.1038/nrg3768
Elowitz MB et al., Stochastic gene expression in a single cell. Science. (2002)
PubMed: 12183631 DOI: 10.1126/science.1070919
Kaern M et al., Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet. (2005)
PubMed: 15883588 DOI: 10.1038/nrg1615
Bianconi E et al., An estimation of the number of cells in the human body. Ann Hum Biol. (2013)
PubMed: 23829164 DOI: 10.3109/03014460.2013.807878
Malumbres M., Cyclin-dependent kinases. Genome Biol. (2014)
PubMed: 25180339
Collins K et al., The cell cycle and cancer. Proc Natl Acad Sci U S A. (1997)
PubMed: 9096291
Zhivotovsky B et al., Cell cycle and cell death in disease: past, present and future. J Intern Med. (2010)
PubMed: 20964732 DOI: 10.1111/j.1365-2796.2010.02282.x
Cho RJ et al., A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell. (1998)
PubMed: 9702192
Spellman PT et al., Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. (1998)
PubMed: 9843569
Orlando DA et al., Global control of cell-cycle transcription by coupled CDK and network oscillators. Nature. (2008)
PubMed: 18463633 DOI: 10.1038/nature06955
Rustici G et al., Periodic gene expression program of the fission yeast cell cycle. Nat Genet. (2004)
PubMed: 15195092 DOI: 10.1038/ng1377
Uhlén M et al., Tissue-based map of the human proteome. Science. (2015)
PubMed: 25613900 DOI: 10.1126/science.1260419
Cellosaurus