All transcripts of all genes have been analyzed for the location(s) of the corresponding protein, using prediction methods for signal peptides and transmembrane regions.
Genes with at least one transcript predicted to encode a secreted protein, according to prediction methods or to UniProt location data, have been further annotated and classified to determine whether the corresponding protein(s) are secreted or are actually retained in intracellular locations or attached to a membrane.
The remaining genes, with no transcript predicted to encode a secreted protein, are assigned the prediction-based location(s).
The annotated location overrules the predicted location, so that a gene encoding a predicted secreted protein that has been annotated as intracellular will have intracellular as the final location.
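The override rule above can be illustrated with a minimal sketch. The function name `final_locations` and the location strings are hypothetical, chosen only for illustration; this is not the Human Protein Atlas implementation.

```python
from typing import List, Optional, Sequence


def final_locations(predicted: Sequence[str],
                    annotated: Optional[Sequence[str]] = None) -> List[str]:
    """Return the final location(s) for a gene.

    An annotated location, when present, overrules the predicted one.
    (Hypothetical helper; names and strings are assumptions.)
    """
    if annotated:  # a non-empty annotation takes precedence
        return list(annotated)
    return list(predicted)


# A predicted secreted protein annotated as intracellular ends up intracellular.
print(final_locations(["secreted"], ["intracellular"]))
# With no annotation, the prediction-based location is kept.
print(final_locations(["membrane"]))
```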
Number of protein-coding transcripts from the gene as defined by Ensembl.
HUMAN PROTEIN ATLAS INFORMATION
Summary of data presented in the Pathology Atlas, with representative images of protein expression in cancer (left) and correlation between mRNA expression and patient survival (right). Images are clickable and redirect to pages with more Pathology Atlas data. The Pathology Atlas contains mRNA and protein expression data from 17 different forms of human cancer, as well as correlation analysis of mRNA expression and patient survival. The protein expression data is derived from antibody-based protein profiling using immunohistochemistry.
A summary of RNA categories for human tissues, cell lines and cancer tissues. Categories for RNA specificity include tissue enriched, group enriched, tissue enhanced, low tissue specificity and not detected. Categories for RNA distribution include detected in single, detected in some, detected in many, detected in all and not detected.
Human tissue RNA category is based on the consensus dataset, which is a combination of RNA data from human tissues from three sources: HPA, GTEx and FANTOM5. Cell line RNA category is based on RNA data from cell lines from the HPA dataset. More information can be found about the normalization and classification of these datasets.
Cancer tissue RNA category is based on RNA data from The Cancer Genome Atlas (TCGA), categorized in the same way as human tissues and cell lines.
Group enriched (brain, lung, retina) Detected in many
HPA (cell line):
Low cell line specificity Detected in many
TCGA (cancer tissue):
Low cancer specificity Detected in all
Evidence score for genes based on UniProt protein existence (UniProt evidence); a Human Protein Atlas antibody- or RNA-based score (HPA evidence); and evidence based on PeptideAtlas (MS evidence). The available scores are evidence at protein level, evidence at transcript level, no evidence, or not available.
A summary of the overall protein expression pattern across the analyzed normal tissues. The summary is based on knowledge-based annotation.
"Estimation of protein expression could not be performed. View primary data." is shown for genes analyzed with a knowledge-based approach where the available RNA-seq and gene/protein characterization data, in combination with immunohistochemistry data, have been evaluated as insufficient to yield a reliable estimation of the protein expression profile.
Nuclear and membrane expression in subsets of endothelial cells and peripheral nerves. Membrane expression in lens fiber cells.
IMMUNOHISTOCHEMISTRY DATA RELIABILITY
Data reliability description
Standardized explanatory sentences with additional information required for full understanding of the protein expression profile, based on knowledge-based and secretome-based annotation.
High consistency between antibody staining and RNA expression data. Antibody staining in cells/structures not annotated, view images.
Reliability score - normal tissues
A reliability score is manually set for all genes and indicates the level of reliability of the analyzed protein expression pattern based on available RNA-seq data, protein/gene characterization data and immunohistochemical data from one or several antibodies with non-overlapping epitopes. The reliability score is based on the 44 normal tissues analyzed, and if there is available data from more than one antibody, the staining patterns of all antibodies are taken into consideration during evaluation.
The reliability score is divided into Enhanced, Supported, Approved, or Uncertain, and is displayed on both Tissue Atlas and Pathology Atlas.
Kaplan-Meier plots for all cancers where high expression of this gene has significant (p<0.001) association with patient survival are shown in this summary. Whether the prognosis is favorable or unfavorable is indicated in brackets. Each Kaplan-Meier plot is clickable and redirects to a detailed page that includes individual expression and survival data for patients with the selected cancer.
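As a rough sketch of the selection rule described above, the snippet below keeps only cancers whose association with survival has p<0.001 and labels each with its prognosis in brackets. The record layout, the function name `significant_cancers`, and the placeholder cancer names and p-values are assumptions made for illustration; they are not data from the atlas.

```python
from typing import Dict, List

P_VALUE_CUTOFF = 0.001  # significance threshold quoted in the text above


def significant_cancers(records: List[Dict]) -> List[str]:
    """Label cancers whose survival association passes the cutoff.

    Each record is a hypothetical dict with the keys
    "cancer", "p_value" and "prognosis".
    """
    labels = []
    for rec in records:
        if rec["p_value"] < P_VALUE_CUTOFF:  # strict inequality, as in p<0.001
            labels.append(f'{rec["cancer"]} ({rec["prognosis"]})')
    return labels


# Placeholder records, invented for illustration only.
example = [
    {"cancer": "cancer A", "p_value": 0.0002, "prognosis": "favorable"},
    {"cancer": "cancer B", "p_value": 0.03, "prognosis": "unfavorable"},
]
print(significant_cancers(example))
```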
RNA expression overview shows RNA-seq data from The Cancer Genome Atlas (TCGA).
RNA-seq data in 17 cancer types are reported as median FPKM (number of Fragments Per Kilobase of exon per Million reads), generated by The Cancer Genome Atlas (TCGA). RNA cancer tissue category is calculated based on mRNA expression levels across all 17 cancer tissues and includes: cancer tissue enriched, cancer group enriched, cancer tissue enhanced, expressed in all, mixed and not detected. Distribution across the dataset is visualized with box plots, shown as median and 25th and 75th percentiles. Points are displayed as outliers if they lie more than 1.5 times the interquartile range above the 75th percentile or below the 25th percentile. To access cancer specific RNA and prognostic data, click on the cancer name. The cancer types are color-coded according to which type of normal organ the cancer originates from.
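The box-plot summary described above can be sketched as follows, assuming the common convention that outliers lie more than 1.5 times the interquartile range beyond the 25th or 75th percentile. The function name `box_plot_summary`, the choice of percentile method, and the sample values are assumptions; the atlas does not state which percentile interpolation it uses.

```python
import statistics
from typing import Dict, List


def box_plot_summary(fpkm: List[float]) -> Dict[str, object]:
    """Compute median, quartiles and outliers for a list of FPKM values.

    Uses the "inclusive" percentile method from the standard library;
    this choice is an assumption, not the atlas's documented method.
    """
    q1, median, q3 = statistics.quantiles(fpkm, n=4, method="inclusive")
    iqr = q3 - q1                # interquartile range
    lower = q1 - 1.5 * iqr       # lower outlier fence
    upper = q3 + 1.5 * iqr       # upper outlier fence
    outliers = [v for v in fpkm if v < lower or v > upper]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Invented sample values, for illustration only.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(box_plot_summary(values))
```

With these made-up values, only the single extreme point falls outside the fences and would be drawn as an outlier.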
Antibody staining in 20 different cancers is summarized by a selection of four standard cancer tissue samples representative of the overall staining pattern. From left: colorectal cancer, breast cancer, prostate cancer and lung cancer. An additional fifth image can be added as a complement. The assay and annotation are described here. Note that samples used for immunohistochemistry by the Human Protein Atlas do not correspond to samples in the TCGA dataset.
For each cancer, color-coded bars indicate the percentage of patients (maximum 12 patients) with high and medium protein expression level. The cancer types are color-coded according to which type of normal organ the cancer originates from. Low or not detected protein expression results in a white bar. Mouse-over function shows details about expression level and normal tissue of origin. The images and annotations can be accessed by clicking on the cancer name or protein expression bar. If more than one antibody is analyzed, the tabs at the top of the staining summary section can be used to toggle between the different antibodies.
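The bar length described above is a simple percentage, which can be sketched as below. The function name `percent_high_or_medium` and the level strings are hypothetical, and the sample of 12 patients is invented for illustration.

```python
from typing import List


def percent_high_or_medium(levels: List[str]) -> float:
    """Percentage of patients with high or medium protein expression.

    `levels` holds one expression level per patient; the strings
    "high", "medium", "low" and "not detected" are assumed labels.
    """
    if not levels:
        raise ValueError("at least one patient sample is required")
    hits = sum(1 for level in levels if level in ("high", "medium"))
    return 100.0 * hits / len(levels)


# Invented example with the maximum of 12 patients.
samples = ["high"] * 3 + ["medium"] * 3 + ["low"] * 4 + ["not detected"] * 2
print(percent_high_or_medium(samples))
```

A result of 0 corresponds to the white bar shown when expression is low or not detected in every patient.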