Secreted proteins with unknown location

There are 173 proteins that are possibly secreted or have secreted isoforms, but for which local or systemic secretion cannot be determined because the data are lacking, of low reliability or conflicting, or because the existence of a secreted form is dubious.

Functions of secreted proteins with unknown location

In Figure 1, the function of each of these proteins has been annotated, based mainly on UniProt molecular function and biological process keywords, to a single function term according to the following hierarchy: Blood coagulation, Complement pathway, Acute phase, Cytokine, Hormone, Neuropeptide, Growth factor, Receptor, Lectin, Transport, Developmental protein, Defence, Enzyme, Enzyme inhibitor, Transcription, Immunity, Cell adhesion. A protein matching several terms is assigned the one that comes first in this list. Other UniProt terms to which only a few proteins were annotated were grouped under the term Other. According to these criteria, fewer than half (48%) of these genes encode proteins with a known function according to UniProt keywords. The largest groups are enzymes (n = 35), including some potential phospholipases and serine proteases, and receptors (n = 12) with possibly secreted isoforms.

Figure 1. Functions of the secreted proteins with unknown location. Annotation was based on UniProt molecular function and biological process keywords. Each bar is clickable and gives a search result of proteins that belong to the selected category.
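The hierarchical assignment described above can be illustrated with a minimal sketch. It assumes that a protein's keywords can be compared to the hierarchy terms by exact string match, which is a simplification: the actual mapping from UniProt keywords to these terms is not given in the text. The function name `assign_function` and the label "No annotated function" are hypothetical and used only for illustration.

```python
# Priority order copied from the text; earlier terms win over later ones.
FUNCTION_HIERARCHY = [
    "Blood coagulation", "Complement pathway", "Acute phase", "Cytokine",
    "Hormone", "Neuropeptide", "Growth factor", "Receptor", "Lectin",
    "Transport", "Developmental protein", "Defence", "Enzyme",
    "Enzyme inhibitor", "Transcription", "Immunity", "Cell adhesion",
]


def assign_function(keywords):
    """Reduce a collection of keywords to a single function term.

    Assumption: keywords match hierarchy terms by exact string comparison.
    """
    keyword_set = set(keywords)
    for term in FUNCTION_HIERARCHY:
        if term in keyword_set:
            return term  # first match in priority order
    # Keywords outside the hierarchy are pooled as "Other".
    # The label for proteins without keywords is a hypothetical placeholder.
    return "Other" if keyword_set else "No annotated function"


if __name__ == "__main__":
    # A protein tagged as both receptor and cytokine resolves to the
    # higher-priority term.
    print(assign_function(["Receptor", "Cytokine"]))
```

Because the list is scanned in priority order, a protein carrying both Receptor and Cytokine keywords is counted once, under Cytokine, so the bars in a chart like Figure 1 do not double-count proteins.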

Tissue specificity classification

The genes were further analyzed based on mRNA expression and categorized according to tissue specificity and tissue distribution. Only 14% of the genes were classified as tissue enriched or group enriched, i.e. having at least five-fold higher mRNA level in one tissue, or in a group of two to five tissues, compared to all other tissues. For 5 genes, RNA expression was not detected in any of the analyzed samples from HPA, GTEx or FANTOM (Fig. 2). Regarding expression patterns across all tissues, more than half of the genes show a less tissue-restricted expression and were detected in more than 30% of the tissues, with 28% being expressed in all analyzed tissues (Fig. 3).

Figure 2. Tissue specificity for the genes classified as secreted with unknown location. Categories include: tissue enriched, defined as mRNA level in one tissue at least five-fold higher than in all other tissues; group enriched, defined as five-fold higher average mRNA level in a group of two to five tissues compared to all other tissues; tissue enhanced, defined as five-fold higher average mRNA level in one or more tissues compared to the mean mRNA level of all tissues; expressed in all, defined as ≥ 1 NX in all tissues; and not detected, defined as < 1 NX in all tissues.
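The category definitions in the Figure 2 caption can be written as a small decision procedure. The sketch below is one possible reading, not the atlas pipeline itself. The order in which the rules are tested, the comparison of a group's average against the highest level outside the group, and the fallback label "mixed" are all assumptions. The function name `classify_specificity` and the input layout (a mapping from tissue name to NX value) are hypothetical.

```python
from statistics import mean

DETECTION_CUTOFF = 1.0  # NX cutoff, from the caption
FOLD = 5.0              # five-fold threshold, from the caption


def classify_specificity(nx_by_tissue):
    """Classify one gene from a {tissue: NX} mapping (at least two tissues)."""
    if len(nx_by_tissue) < 2:
        raise ValueError("need expression values for at least two tissues")
    levels = sorted(nx_by_tissue.values(), reverse=True)

    if levels[0] < DETECTION_CUTOFF:
        return "not detected"
    # One tissue at least five-fold above every other tissue.
    if levels[0] >= FOLD * levels[1]:
        return "tissue enriched"
    # Average of the top 2-5 tissues at least five-fold above the highest
    # remaining tissue (assumed reading of "compared to all other tissues").
    for size in range(2, 6):
        if size >= len(levels):
            break
        if mean(levels[:size]) >= FOLD * levels[size]:
            return "group enriched"
    # Highest tissue at least five-fold above the mean of all tissues.
    if levels[0] >= FOLD * mean(levels):
        return "tissue enhanced"
    if levels[-1] >= DETECTION_CUTOFF:
        return "expressed in all"
    return "mixed"  # hypothetical fallback; not listed in the caption


if __name__ == "__main__":
    print(classify_specificity({"liver": 100.0, "kidney": 10.0, "brain": 5.0}))
```

Sorting the levels once keeps each rule a simple comparison against a prefix of the sorted list.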

Figure 3. The genes classified as secreted with unknown location, categorized according to tissue distribution, where n is the number or percentage of tissues in which the gene is detected. Categories include: detected in all, defined as n = 100%; detected in many, defined as 31% ≤ n < 100%; detected in some, defined as more than one tissue but n < 31%; detected in single, defined as n = 1 tissue; and not detected, defined as n = 0.
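The distribution categories in the Figure 3 caption amount to counting the tissues in which a gene is detected. The following sketch assumes that "detected" means at least 1 NX, a cutoff borrowed from the Figure 2 caption because the Figure 3 caption does not state one. The function name `classify_distribution` and the input layout are hypothetical.

```python
def classify_distribution(nx_by_tissue, cutoff=1.0):
    """Classify one gene by the share of tissues in which it is detected.

    Assumption: a gene counts as detected in a tissue when NX >= cutoff.
    """
    total = len(nx_by_tissue)
    detected = sum(1 for level in nx_by_tissue.values() if level >= cutoff)

    if detected == 0:
        return "not detected"
    if detected == 1:
        return "detected in single"
    if detected == total:
        return "detected in all"
    if detected / total >= 0.31:
        return "detected in many"
    return "detected in some"


if __name__ == "__main__":
    # Ten hypothetical tissues, four of them above the cutoff.
    example = {f"tissue_{i}": (5.0 if i < 4 else 0.0) for i in range(10)}
    print(classify_distribution(example))
```

The single-tissue check is placed before the "all" check, so the ambiguous case of a dataset with only one tissue resolves to "detected in single"; this ordering is a choice made for the sketch.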

Origin of secreted proteins with unknown location

The tissue of origin, according to RNA-seq data, was investigated for the 5 proteins found to be tissue enriched, and the result is shown in Figure 4.

Figure 4. The tissue origin, according to the highest mRNA level, of tissue enriched genes belonging to the secreted with unknown location category. Each bar is clickable and gives a search result of proteins that belong to the selected category.