The cell cycle dependent transcriptome and proteome

The cell cycle is an ordered and tightly regulated series of events, over which the cell grows and divides into two daughter cells. It consists of four stages, during which the cell increases in size (G1), replicates its genome (S), increases further in size and prepares for mitosis (G2), and finally goes through mitosis as well as cytokinesis (M). Depending on external and internal signals, the cell may also exit the replicative cell cycle from G1 and enter a non-replicative resting state (G0). Dysregulation of the cell cycle is known to have devastating consequences, such as uncontrolled cell proliferation, genomic instability (Malumbres M et al. (2009)), and cancer (Massagué J. (2004); Hartwell LH et al. (1994)). Therefore, the cell cycle needs to be tightly controlled, while at the same time remaining responsive to various intracellular and extracellular signals (Barnum KJ et al. (2014)). The cell cycle control system involves an intricate network of proteins that are tightly regulated by mechanisms such as transcriptional regulation (Weinberg RA. (1995)), protein post-translational modifications (PTMs) (Morgan DO. (1995)), and protein degradation (Teixeira LK et al. (2013); King RW et al. (1996)).

In asynchronous cell cultures, the cell cycle is a fundamental source of cell-to-cell variation in both transcript and protein abundances (Cho RJ et al. (2001); Whitfield ML et al. (2002); Boström J et al. (2017); Lane KR et al. (2013); Ohta S et al. (2010); Ly T et al. (2014); Pagliuca FW et al. (2011); Ly T et al. (2015)). The Cell Atlas provides a resource to explore protein heterogeneity at the single cell level in unperturbed log-phase growing cells. Among the 12813 genes in the Cell Atlas, a quarter (25%, n=3141) show cell-to-cell variation in terms of expression level and/or spatial distribution of the encoded protein(s) in at least one cell line. For 1149 of these, the temporal protein and RNA expression patterns have been further characterized in individual cells using the Fluorescent Ubiquitination-based Cell Cycle Indicator (FUCCI) U-2 OS cell line (Mahdessian D, Cesnik AJ et al. (2021)). In total, there are 529 genes encoding proteins identified to be cell cycle dependent (CCD), including 222 in mitotic structures, 318 in interphase and 11 in both. Furthermore, among the 13450 genes found to be expressed in FUCCI U-2 OS, there are 401 genes encoding CCD transcripts. This spatially resolved proteomic map of the cell cycle has been integrated into the Cell Atlas in order to provide a resource for molecular insights into the human cell cycle and cellular proliferation.
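The counts quoted above can be cross-checked with inclusion-exclusion arithmetic. The short sketch below uses only numbers stated in this section; the variable names are illustrative and not part of any Cell Atlas interface.

```python
# Counts quoted in the text above (variable names are illustrative).
genes_in_cell_atlas = 12813
genes_with_variation = 3141
ccd_mitotic = 222
ccd_interphase = 318
ccd_both = 11
genes_expressed_fucci = 13450
ccd_transcripts = 401

# Inclusion-exclusion: proteins found in both groups are counted only once.
ccd_proteins_total = ccd_mitotic + ccd_interphase - ccd_both

# Percentages behind the rounded figures quoted in the text.
pct_variation = 100 * genes_with_variation / genes_in_cell_atlas
pct_ccd_transcripts = 100 * ccd_transcripts / genes_expressed_fucci

print(ccd_proteins_total)            # 529
print(round(pct_variation))          # 25
print(round(pct_ccd_transcripts))    # 3
```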

Single-cell variation in the Cell Atlas

Genetically identical cells may exhibit differences in their patterns of gene and protein expression. This phenomenon is often referred to as cell-to-cell variation or single-cell variation (SCV). While it is hypothesized that there is an underlying functional importance to this variability, the scale and significance of variations at the single-cell level remain poorly understood (Dueck H et al. (2016)). Environmental changes, DNA damage, cell cycle progression, and stochasticity are examples of factors that may cause changes in RNA and protein expression within isogenic cell populations, and thus serve as sources of single-cell heterogeneity (Snijder B et al. (2011)). This may create different phenotypic characteristics within individual cells and provide them with a molecular and phenotypic fingerprint. Identification of all human proteins that display single-cell variation lays a foundation for characterizing the driving forces of single-cell heterogeneity, and for understanding the functional consequences.

In an immunofluorescence (IF) image, single-cell protein variations can be observed as differences in the staining intensity or spatial distribution between cells, as exemplified in Figure 1. Interestingly, as many as 25% (n=3141) of all human proteins localized in the Cell Atlas show single-cell variations (Thul PJ et al. (2017)). Of these, 2959 proteins show variations in expression level (staining intensity), and 211 proteins show variations in spatial distribution.


Figure 1 image panels: GTPBP8 (U-2 OS), CLCN6 (U-2 OS), INCENP (MCF7), RACGAP1 (U-2 OS), RRM2 (U-2 OS), KIF20A (U-2 OS), DUSP18 (A-431), DUSP19 (SK-MEL-30), CCNB1 (U-2 OS).

Figure 1. Examples of proteins showing single-cell variation. GTPBP8 is a GTP binding protein (detected in U-2 OS cells). CLCN6 is a chloride transport protein (detected in U-2 OS cells). INCENP is a component of the chromosomal passenger complex (CPC) that is a key regulator of mitosis (detected in MCF7 cells). RACGAP1 plays key roles in controlling cell growth and cell division (detected in U-2 OS cells). RRM2 provides precursors necessary for DNA synthesis (detected in U-2 OS cells). KIF20A is a mitotic kinesin required for cytokinesis (detected in U-2 OS cells). DUSP18 and DUSP19 are phosphatases (detected in A-431 and SK-MEL-30 cells, respectively). CCNB1 is a key regulator of the cell cycle at the G2/M transition for cell division (detected in U-2 OS cells). The target protein is shown in green, microtubules in red, and the nucleus in blue.

Single-cell variation is most commonly observed for proteins localized to the nucleus, cytosol, nucleoli and mitochondria (Figure 2). Gene Ontology (GO)-based enrichment analysis of genes encoding proteins with single-cell variation at protein level reveals an enrichment of GO terms describing processes associated with cellular responses to various extracellular stimuli, apoptosis, cell differentiation, cell cycle progression and metabolism (Figure 3).

Figure 2. Localizations of proteins showing single-cell variations to the different organelles, grouped by meta-compartments.

Figure 3. Gene Ontology-based enrichment analysis for genes encoding proteins with single-cell variations, showing the significantly enriched terms for the GO domain Biological Process. Each bar is clickable and gives a search result of proteins that belong to the selected category.
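Enrichment of a GO term in a gene set is commonly assessed with a one-sided hypergeometric test. This section does not state which statistical method was used, so the sketch below is an assumption about the general approach rather than a description of the Cell Atlas analysis. The function name and the toy numbers are hypothetical.

```python
from math import comb

def hypergeom_enrichment_pvalue(population, annotated, sample, overlap):
    """Probability of seeing at least `overlap` annotated genes when
    `sample` genes are drawn without replacement from `population` genes,
    of which `annotated` carry the GO term."""
    max_overlap = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution exactly,
    # using Python's arbitrary-precision integers.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Toy example: 10 genes in total, 4 annotated with the term,
# 3 genes in the set of interest, 2 of which are annotated.
p_value = hypergeom_enrichment_pvalue(10, 4, 3, 2)
print(p_value)  # equals 40/120
```

In a real analysis the population would be the set of genes with localization data and the sample the genes showing single-cell variation, and the resulting p-values would need correction for multiple testing across GO terms.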

Interphase proteogenomics in single cells

Previous studies of transcript and protein abundance in different phases of the human cell cycle have revealed variations in the expression of 400-1,200 genes (Cho RJ et al. (2001); Whitfield ML et al. (2002); Boström J et al. (2017)) and 300-700 proteins (Lane KR et al. (2013); Ohta S et al. (2010); Ly T et al. (2014); Pagliuca FW et al. (2011); Ly T et al. (2015)). However, cell synchronization is known to alter gene expression (Cooper S et al. (2007)), cell morphology and metabolism (Davis PK et al. (2001)), and precludes the discovery of expression changes within cell cycle phases. The use of single-cell RNA sequencing has allowed the analysis of transcriptional changes without the need for synchronization and has enabled the discovery of additional cell cycle regulated genes (Domenighetti G et al. (1988); Scialdone A et al. (2015)). However, studies of cell cycle dependent (CCD) variations in protein expression at single-cell level have been lacking due to technological limitations.

The HPA Cell Atlas now includes a targeted single-cell transcriptomic analysis, as well as proteomic imaging (i.e., imaging proteogenomics, Figure 4) of 1149 proteins that show single-cell variability in the Cell Atlas and that are expressed in FUCCI U-2 OS cells (Sakaue-Sawano A et al. (2008)). This cell line expresses a pair of fluorescently tagged marker proteins, CDT1 tagged with red fluorescent protein (RFP) and Geminin tagged with green fluorescent protein (GFP), which enable visualization of interphase progression in individual cells. The intensities of the RFP- and GFP-tagged cell cycle markers can be used to create a linear representation of cell cycle pseudotime, enabling protein and RNA expression in individual cells to be plotted along an axis representing progression through interphase.


Figure 4. Schematic overview of the single-cell imaging proteogenomic workflow. U-2 OS FUCCI cells express two fluorescently tagged cell cycle markers, CDT1 during G1 phase (red, RFP-tagged) and Geminin during S and G2 phases (green, GFP-tagged); these markers are co-expressed during the G1-S transition (yellow). By fitting a polar model to the red and green fluorescence intensities, a linear representation of cell cycle pseudotime is obtained. Independent measurements of RNA and protein expression are compared after pseudotime alignment of individual cells.
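The idea of turning the two marker intensities into a single pseudotime value can be illustrated with a minimal polar-coordinate sketch. This is a simplification under stated assumptions, not the published model fit: the ring centre is approximated by the population mean in log space, and the starting angle (both markers low) is an assumed convention. The function name `fucci_pseudotime` and its parameters are hypothetical.

```python
import math

def fucci_pseudotime(red, green, start_angle=1.25 * math.pi):
    """Return one pseudotime value between 0 and 1 per cell.

    red, green: positive marker intensities (CDT1-RFP and Geminin-GFP).
    start_angle: angle treated as pseudotime 0 (assumed: both markers low).
    """
    # Work in log space so that bright and dim cells are comparable.
    x = [math.log10(r) for r in red]
    y = [math.log10(g) for g in green]
    # Assumption: the population mean approximates the centre of the ring.
    cx = sum(x) / len(x)
    cy = sum(y) / len(y)
    two_pi = 2 * math.pi
    pseudotime = []
    for xi, yi in zip(x, y):
        # Angle of the cell around the centre, with red on the x-axis.
        angle = math.atan2(yi - cy, xi - cx) % two_pi
        # Unroll the angle into a linear axis starting at start_angle.
        pseudotime.append(((angle - start_angle) % two_pi) / two_pi)
    return pseudotime

# Four synthetic cells placed around the ring (toy intensities).
red = [100, 1000, 100, 10]
green = [10, 100, 1000, 100]
print(fucci_pseudotime(red, green))
```

With red on the x-axis and green on the y-axis, increasing angle runs from red-dominant cells through double-positive cells to green-dominant cells, which matches the G1, G1-S, S/G2 order described in the caption.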

The single-cell RNA-sequencing data from the FUCCI U-2 OS cells enable analysis of RNA abundance in relation to cell cycle progression. Among the 13450 protein-coding genes expressed in FUCCI U-2 OS, 401 genes (3%) show variation in RNA expression levels that correlates with cell cycle progression.

In the single-cell proteomic imaging analysis, 318 proteins display variation in protein expression levels that temporally correlates with interphase progression through G1, S and G2. The cell cycle dependent (CCD) proteins include known cell cycle regulators, such as the cyclin CCNB1 and ANLN, which is required for cytokinesis, but also novel CCD proteins, such as SCIN and DUSP18 (Figure 5). However, most proteins (831) show cell-to-cell variations that are largely unexplained by cell cycle progression (non-CCD). This opens up intriguing avenues for further exploration of the stochastic or deterministic factors that govern these variations, as well as the role of spatiotemporal proteome dynamics in regulating other cellular states and functions.
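One way to make "explained by cell cycle progression" concrete is to compare the variance left after removing a smooth trend along pseudotime with the total variance. The sketch below illustrates that idea with a simple moving average; the exact statistic and any cut-off used to call a protein CCD are not given in this section, so the function name, the window size and the interpretation are assumptions.

```python
def variance_explained_by_pseudotime(pseudotime, expression, window=5):
    """Fraction of expression variance captured by a moving-average
    trend along pseudotime (values near 1: strongly cell cycle related)."""
    # Order cells by their position in the cell cycle.
    order = sorted(range(len(pseudotime)), key=lambda i: pseudotime[i])
    values = [expression[i] for i in order]
    n = len(values)
    half = window // 2
    # Moving-average trend; the window is truncated at both ends.
    trend = []
    for i in range(n):
        neighbours = values[max(0, i - half):min(n, i + half + 1)]
        trend.append(sum(neighbours) / len(neighbours))
    mean = sum(values) / n
    total_var = sum((v - mean) ** 2 for v in values) / n
    if total_var == 0:
        return 0.0
    residual_var = sum((v - t) ** 2 for v, t in zip(values, trend)) / n
    return 1.0 - residual_var / total_var

# Toy data: one protein rising steadily through interphase,
# one protein fluctuating independently of pseudotime.
times = [i / 50 for i in range(50)]
rising = [t for t in times]
fluctuating = [1.0 if i % 2 == 0 else -1.0 for i in range(50)]
print(variance_explained_by_pseudotime(times, rising))
print(variance_explained_by_pseudotime(times, fluctuating))
```

The rising protein scores close to 1 and the fluctuating one much lower, mirroring the distinction between CCD and non-CCD proteins drawn in the text. A real analysis would also need a significance test against shuffled pseudotime.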

 
Figure 5 image panels, per protein: protein expression (two panels) and RNA expression. Proteins shown: CCNB1, ANLN, SCIN, DUSP18.