All antibodies generated by the Human Protein Atlas project are affinity purified and have been approved as selectively binding their antigens in protein array assays. Despite this quality control, it cannot be excluded that some antibodies fail to bind the intended target in certain applications. Due to differences in protein conformation and target accessibility, antibodies may perform differently depending on context and application. Factors such as protein denaturation, protein concentration and sample complexity may influence off-target binding events, and could potentially lead to false results. For this reason, the reliability of the results generated by all assays in the Human Protein Atlas is individually assessed and scored.
Quality-assured antibodies have been used in this study and each image has been evaluated by specially trained personnel. However, the complexity of tissues and the lack of verified references for largely unknown proteins mean that immunoreactivity cannot be taken as firm proof of protein expression levels. It cannot be excluded that certain observed and annotated differences in immunoreactivity are due to technical rather than biological reasons. In addition, inter-individual differences regarding both expression patterns and image annotation may play a role.
Also note that antibody-based immunohistochemistry can result in off-target binding, yielding false positive results. This has been taken into consideration when performing knowledge-based annotation and summarizing the protein expression with explanatory sentences. The protein expression levels presented on the summary page are the result of manual correction based on available RNA-seq data and protein/gene characterization data. Staining marked as presumed off-target binding should be interpreted with caution, and one of the future objectives of the Human Protein Atlas program is to resolve whether the discrepancies between RNA and protein levels for these genes are due to technical issues.
In the immunofluorescent analysis of the subcellular distribution of proteins, high-resolution images of a limited number of cells are acquired. The images are single-slice images, each representing one optical section in the cells. It therefore cannot be excluded that there are additional staining localizations not captured in the images.
It also cannot be excluded that differences in intensities and localization are due to technical rather than biological reasons. A single method for fixation and permeabilization of the cells is currently being used, and this method may be more or less suitable for different types of proteins. Some types of proteins will not easily be resolved in their intact subcellular compartment and hence fall into a less resolved compartment, i.e. cytoplasm or nucleus, possibly resulting in an over-representation of these "meta"-compartments.
Western blot analysis has been performed using a routine setup. This setup is composed of total protein extracts from a limited number of tissues/cells and human plasma depleted of serum albumin and IgG. The lack of a verified positive reference for many of the analyzed antibodies and the limited number of included protein sources sometimes exclude Western blot as firm proof of antibody specificity. In addition, due to the high-throughput nature of the project, the majority of the antibodies have been analyzed using a standardized protocol in a single-shot approach without further efforts to optimize the procedure. Therefore, it cannot be excluded that certain observed binding properties are due to technical rather than biological reasons and that further optimization could result in a different outcome. In specific cases, antibodies with an uncertain standard Western blot have been revalidated using an over-expression HEK293T lysate. As an additional validation method for the Human Protein Atlas antibodies, Western blot has also been performed on lysates from siRNA-transfected U-2 OS cells.
The transcriptomics data is based on deep sequencing of RNA libraries. The library preparation and data analysis will unavoidably introduce biases and errors for a small number of genes. While these errors are rare, it is important to be aware of them when studying affected genes, since they will cause discrepancies between presented RNA and protein data. The library preparation method will not capture non-polyadenylated transcripts. This mainly affects histone genes, which incorrectly appear not to be expressed at the RNA level.
Finally, we have detected a minor leakage (on the order of 0.01-0.1%) between samples that were multiplexed in the sequencing. Due to the limited extent of this leakage, its effect is very minor, but genes with high and specific expression in one sample will erroneously appear to have low levels of expression in some other samples. This leakage affects most experiments on the Illumina platform and has been previously described (Kircher M et al., 2012).
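To give a sense of scale for this leakage, the arithmetic can be sketched as follows. This is only an illustration under a simplified assumption that leaked signal adds linearly to the receiving sample; the function name and the expression value of 10,000 units are hypothetical and are not taken from the Human Protein Atlas data or pipeline.

```python
def apparent_expression(true_level, source_level, leakage_percent):
    """Apparent level in a sample after part of another sample's signal leaks in.

    Simplified, hypothetical model: the leaked signal is added linearly.
    """
    return true_level + source_level * (leakage_percent / 100.0)


# Hypothetical gene: highly expressed in sample A, not expressed in sample B.
level_in_a = 10_000.0  # arbitrary expression units (assumed value)
level_in_b = 0.0

# Bounds of the leakage range quoted in the text (0.01% and 0.1%).
low = apparent_expression(level_in_b, level_in_a, 0.01)
high = apparent_expression(level_in_b, level_in_a, 0.1)
print(f"Apparent level in sample B: {low:.2f} to {high:.2f}")
```

Under these assumptions, a gene that is truly absent from sample B would still show a small non-zero level there, several orders of magnitude below its level in sample A.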
Note that cancer patients included in the Cancer Genome Atlas have been selected based on "convenience of collection" and thus do not fully represent consecutive or other forms of cancer patient cohorts epidemiologically designed to analyze prognostic factors. This means that the lists of favorable and unfavorable genes presented in the Pathology Atlas for the different forms of cancer need to be validated in independent cohorts before being accepted as clinically relevant.
The presented patient survival data is based on overall survival (OS) and does not truly reflect cancer-specific death among patients. Furthermore, no consideration has been given to the different forms of cancer treatment that patients have received, which may also affect patient survival time.