- Analysis of the Human Protein Atlas Image Classification competition (vol 16, pg 1254, 2019) NATURE METHODS 2020; 17 (1): 115
- Spatial proteomics: a powerful discovery tool for cell biology NATURE REVIEWS MOLECULAR CELL BIOLOGY 2019; 20 (5): 285–302
Analysis of the Human Protein Atlas Image Classification competition.
2019; 16 (12): 1254–61
Pinpointing subcellular protein localizations from microscopy images is easy to the trained eye, but challenging to automate. Based on the Human Protein Atlas image collection, we held a competition to identify deep learning solutions to solve this task. Challenges included training on highly imbalanced classes and predicting multiple labels per image. Over 3 months, 2,172 teams participated. Despite convergence on popular networks and training techniques, there was considerable variety among the solutions. Participants applied strategies for modifying neural networks and loss functions, augmenting data and using pretrained networks. The winning models far outperformed our previous effort at multi-label classification of protein localization patterns by ~20%. These models can be used as classifiers to annotate new images, feature extractors to measure pattern similarity or pretrained networks for a wide range of biological applications.
View details for DOI 10.1038/s41592-019-0658-6
View details for PubMedID 31780840
- ImJoy: an open-source computational platform for the deep learning era. Nature methods 2019; 16 (12): 1199–1200
Deep learning is combined with massive-scale citizen science to improve large-scale image classification
2018; 36 (9): 820-+
Pattern recognition and classification of images are key challenges throughout the life sciences. We combined two approaches for large-scale classification of fluorescence microscopy images. First, using the publicly available data set from the Cell Atlas of the Human Protein Atlas (HPA), we integrated an image-classification task into a mainstream video game (EVE Online) as a mini-game, named Project Discovery. Participation by 322,006 gamers over 1 year provided nearly 33 million classifications of subcellular localization patterns, including patterns that were not previously annotated by the HPA. Second, we used deep learning to build an automated Localization Cellular Annotation Tool (Loc-CAT). This tool classifies proteins into 29 subcellular localization patterns and can deal efficiently with multi-localization proteins, performing robustly across different cell types. Combining the annotations of gamers and deep learning, we applied transfer learning to create a boosted learner that can characterize subcellular protein distribution with F1 score of 0.72. We found that engaging players of commercial computer games provided data that augmented deep learning and enabled scalable and readily improved image classification.
View details for PubMedID 30125267
A pathology atlas of the human cancer transcriptome.
Science (New York, N.Y.)
2017; 357 (6352)
Cancer is one of the leading causes of death, and there is great interest in understanding the underlying molecular mechanisms involved in the pathogenesis and progression of individual tumors. We used systems-level approaches to analyze the genome-wide transcriptome of the protein-coding genes of 17 major cancer types with respect to clinical outcome. A general pattern emerged: Shorter patient survival was associated with up-regulation of genes involved in cell growth and with down-regulation of genes involved in cellular differentiation. Using genome-scale metabolic models, we show that cancer patients have widespread metabolic heterogeneity, highlighting the need for precise and personalized medicine for cancer treatment. All data are presented in an interactive open-access database (www.proteinatlas.org/pathology) to allow genome-wide exploration of the impact of individual proteins on clinical outcomes.
View details for DOI 10.1126/science.aan2507
View details for PubMedID 28818916
A subcellular map of the human proteome.
Science (New York, N.Y.)
2017; 356 (6340)
Resolving the spatial distribution of the human proteome at a subcellular level can greatly increase our understanding of human biology and disease. Here we present a comprehensive image-based map of subcellular protein distribution, the Cell Atlas, built by integrating transcriptomics and antibody-based immunofluorescence microscopy with validation by mass spectrometry. Mapping the in situ localization of 12,003 human proteins at a single-cell level to 30 subcellular structures enabled the definition of the proteomes of 13 major organelles. Exploration of the proteomes revealed single-cell variations in abundance or spatial distribution and localization of about half of the proteins to multiple compartments. This subcellular map can be used to refine existing protein-protein interaction networks and provides an important resource to deconvolute the highly complex architecture of the human cell.
View details for DOI 10.1126/science.aal3321
View details for PubMedID 28495876
- A proposal for validation of antibodies NATURE METHODS 2016; 13 (10): 823-?
Proteomics. Tissue-based map of the human proteome.
Science (New York, N.Y.)
2015; 347 (6220): 1260419
Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray-based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.
View details for DOI 10.1126/science.1260419
View details for PubMedID 25613900
Spatial Characterization of the Human Centrosome Proteome Opens up New Horizons for a Small but Versatile Organelle.
After a century of research, the human centrosome continues to fascinate. Based on immunofluorescence and confocal microscopy, we present an extensive inventory of the protein components of the human centrosome, and the centriolar satellites, with the important contribution of over 300 novel proteins localizing to these compartments. We identify a network of candidate centrosome proteins involved in ubiquitination, including six interaction partners of the Kelch-like protein 21, and an additional network of protein phosphatases, together supporting the suggested role of the centrosome as an interactive hub for cell signaling. Analysis of multi-localization across cellular organelles analyzed within the Human Protein Atlas project shows how multi-localizing proteins are particularly overrepresented in centriolar satellites, supporting the dynamic nature and wide range of functions for this compartment. In summary, the spatial dissection of the human centrosome and centriolar satellites described here provides a comprehensive knowledgebase for further exploration of their proteomes. Significance Statement: Exploring the constituents of organellar proteomes lays a foundation for detailed understanding of the cellular functions that take place in these settings. We have come to understand that the molecular environments in which cellular processes take place are complex and highly dynamic, with multiple modes of cross-talk within and between cellular compartments. One goal of the Human Protein Atlas project is a systematic mapping of the expression and subcellular localization of all proteins in human cells, using RNA sequencing combined with immunofluorescence and high-resolution confocal microscopy. Here we present an overview and further analysis of the proteins that localize to centrosomes and centriolar satellites, expanding these organellar proteomes with more than 300 candidate proteins. This provides new insights into the characteristics and functions of these compartments, and provides an important knowledge base for further studies of cellular processes that take place at centrosomes and in centriolar satellites.
View details for DOI 10.1002/pmic.201900361
View details for PubMedID 32558245
- Voices in methods development. Nature methods 2019; 16 (10): 945–51
The human secretome.
2019; 12 (609)
The proteins secreted by human cells (collectively referred to as the secretome) are important not only for the basic understanding of human biology but also for the identification of potential targets for future diagnostics and therapies. Here, we present a comprehensive analysis of proteins predicted to be secreted in human cells, which provides information about their final localization in the human body, including the proteins actively secreted to peripheral blood. The analysis suggests that a large number of the proteins of the secretome are not secreted out of the cell, but instead are retained intracellularly, whereas another large group of proteins were identified that are predicted to be retained locally at the tissue of expression and not secreted into the blood. Proteins detected in the human blood by mass spectrometry-based proteomics and antibody-based immunoassays are also presented with estimates of their concentrations in the blood. The results are presented in an updated version 19 of the Human Protein Atlas in which each gene encoding a secretome protein is annotated to provide an open-access knowledge resource of the human secretome, including body-wide expression data, spatial localization data down to the single-cell and subcellular levels, and data about the presence of proteins that are detectable in the blood.
View details for DOI 10.1126/scisignal.aaz0274
View details for PubMedID 31772123
Experimental validation of predicted cancer genes using FRET.
Methods and applications in fluorescence
2018; 6 (3): 035007
Huge amounts of data are generated in genome wide experiments, designed to investigate diseases with complex genetic causes. Follow up of all potential leads produced by such experiments is currently cost prohibitive and time consuming. Gene prioritization tools alleviate these constraints by directing further experimental efforts towards the most promising candidate targets. Recently a gene prioritization tool called MaxLink was shown to outperform other widely used state-of-the-art prioritization tools in a large scale in silico benchmark. An experimental validation of predictions made by MaxLink has however been lacking. In this study we used Fluorescence Resonance Energy Transfer, an established experimental technique for detection of protein-protein interactions, to validate potential cancer genes predicted by MaxLink. Our results provide confidence in the use of MaxLink for selection of new targets in the battle with polygenic diseases.
View details for DOI 10.1088/2050-6120/aab932
View details for PubMedID 29570091
Seeing More: A Future of Augmented Microscopy.
2018; 173 (3): 546-548
Microscope images are information rich. In this issue of Cell, Christiansen et al. show that label-free images of cells can be used to predict fluorescent labels representing cell type, state, and organelle distribution using a deep-learning framework. This paves the way for computationally multiplexed assays derived from inexpensive label-free microscopy.
View details for DOI 10.1016/j.cell.2018.04.003
View details for PubMedID 29677507
Transcriptome profiling of the interconnection of pathways involved in malignant transformation and response to hypoxia.
2018; 9 (28): 19730-19744
In tumor tissues, hypoxia is a commonly observed feature resulting from rapidly proliferating cancer cells outgrowing their surrounding vasculature network. Transformed cancer cells are known to exhibit phenotypic alterations, enabling continuous proliferation despite a limited oxygen supply. The four-step isogenic BJ cell model enables studies of defined steps of tumorigenesis: the normal, immortalized, transformed, and metastasizing stages. By transcriptome profiling under atmospheric and moderate hypoxic (3% O2) conditions, we observed that despite being highly similar, the four cell lines of the BJ model responded strikingly differently to hypoxia. Besides corroborating many of the known responses to hypoxia, we demonstrate that the transcriptome adaptation to moderate hypoxia resembles the process of malignant transformation. The transformed cells displayed a distinct capability of metabolic switching, reflected in reversed gene expression patterns for several genes involved in oxidative phosphorylation and glycolytic pathways. By profiling the stage-specific responses to hypoxia, we identified ASS1 as a potential prognostic marker in hypoxic tumors. This study demonstrates the usefulness of the BJ cell model for highlighting the interconnection of pathways involved in malignant transformation and hypoxic response.
View details for DOI 10.18632/oncotarget.24808
View details for PubMedID 29731978
View details for PubMedCentralID PMC5929421
CEP128 Localizes to the Subdistal Appendages of the Mother Centriole and Regulates TGF-β/BMP Signaling at the Primary Cilium.
2018; 22 (10): 2584-2592
The centrosome is the main microtubule-organizing center in animal cells and comprises a mother and daughter centriole surrounded by pericentriolar material. During formation of primary cilia, the mother centriole transforms into a basal body that templates the ciliary axoneme. Ciliogenesis depends on mother centriole-specific distal appendages, whereas the role of subdistal appendages in ciliary function is unclear. Here, we identify CEP128 as a centriole subdistal appendage protein required for regulating ciliary signaling. Loss of CEP128 did not grossly affect centrosomal or ciliary structure but caused impaired transforming growth factor-β/bone morphogenetic protein (TGF-β/BMP) signaling in zebrafish and at the primary cilium in cultured mammalian cells. This phenotype is likely the result of defective vesicle trafficking at the cilium as ciliary localization of RAB11 was impaired upon loss of CEP128, and quantitative phosphoproteomics revealed that CEP128 loss affects TGF-β1-induced phosphorylation of multiple proteins that regulate cilium-associated vesicle trafficking.
View details for DOI 10.1016/j.celrep.2018.02.043
View details for PubMedID 29514088
GeneGini: Assessment via the Gini Coefficient of Reference "Housekeeping" Genes and Diverse Human Transporter Expression Profiles.
2018; 6 (2): 230-244.e1
The expression levels of SLC or ABC membrane transporter transcripts typically differ 100- to 10,000-fold between different tissues. The Gini coefficient characterizes such inequalities and here is used to describe the distribution of the expression of each transporter among different human tissues and cell lines. Many transporters exhibit extremely high Gini coefficients even for common substrates, indicating considerable specialization consistent with divergent evolution. The expression profiles of SLC transporters in different cell lines behave similarly, although Gini coefficients for ABC transporters tend to be larger in cell lines than in tissues, implying selection. Transporter genes are significantly more heterogeneously expressed than the members of most non-transporter gene classes. Transcripts with the stablest expression have a low Gini index and often differ significantly from the "housekeeping" genes commonly used for normalization in transcriptomics/qPCR studies. PCBP1 has a low Gini coefficient, is reasonably expressed, and is an excellent novel reference gene. The approach, referred to as GeneGini, provides rapid and simple characterization of expression-profile distributions and improved normalization of genome-wide expression-profiling data.
View details for DOI 10.1016/j.cels.2018.01.003
View details for PubMedID 29428416
View details for PubMedCentralID PMC5840522
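The Gini coefficient mentioned in the GeneGini abstract above can be illustrated with a minimal sketch. This is the standard sorted-rank formula for non-negative values, not necessarily the exact computation used in the paper; the function name `gini` and the example expression values are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even).

    Uses the standard sorted-rank formula:
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted ascending and i running from 1 to n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical expression levels of one transcript across four tissues.
evenly_expressed = [5.0, 5.0, 5.0, 5.0]    # same level everywhere
tissue_specific = [0.0, 0.0, 0.0, 10.0]    # expressed in a single tissue

print(gini(evenly_expressed))
print(gini(tissue_specific))
```

With this formula, a transcript expressed at the same level in every tissue scores 0, and one confined to a single tissue out of n scores (n - 1)/n, which is why stably expressed reference genes are expected to have low values.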
How many human proteoforms are there?
Nature chemical biology
2018; 14 (3): 206–14
Despite decades of accumulated knowledge about proteins and their post-translational modifications (PTMs), numerous questions remain regarding their molecular composition and biological function. One of the most fundamental queries is the extent to which the combinations of DNA-, RNA- and PTM-level variations explode the complexity of the human proteome. Here, we outline what we know from current databases and measurement strategies including mass spectrometry-based proteomics. In doing so, we examine prevailing notions about the number of modifications displayed on human proteins and how they combine to generate the protein diversity underlying health and disease. We frame central issues regarding determination of protein-level variation and PTMs, including some paradoxes present in the field today. We use this framework to assess existing data and to ask the question, "How many distinct primary structures of proteins (proteoforms) are created from the 20,300 human genes?" We also explore prospects for improving measurements to better regularize protein-level biology and efficiently associate PTMs to function and phenotype.
View details for PubMedID 29443976
Comparative cell cycle transcriptomics reveals synchronization of developmental transcription factor networks in cancer cells.
2017; 12 (12): e0188772
The cell cycle coordinates core functions such as replication and cell division. However, cell-cycle-regulated transcription in the control of non-core functions, such as cell identity maintenance through specific transcription factors (TFs) and signalling pathways remains unclear. Here, we provide a resource consisting of mapped transcriptomes in unsynchronized HeLa and U2OS cancer cells sorted for cell cycle phase by Fucci reporter expression. We developed a novel algorithm for data analysis that enables efficient visualization and data comparisons and identified cell cycle synchronization of Notch signalling and TFs associated with development. Furthermore, the cell cycle synchronizes with the circadian clock, providing a possible link between developmental transcriptional networks and the cell cycle. In conclusion we find that cell cycle synchronized transcriptional patterns are temporally compartmentalized and more complex than previously anticipated, involving genes, which control cell identity and development.
View details for DOI 10.1371/journal.pone.0188772
View details for PubMedID 29228002
View details for PubMedCentralID PMC5724894
The Human Cell Atlas
The recent advent of methods for high-throughput single-cell molecular profiling has catalyzed a growing sense in the scientific community that the time is ripe to complete the 150-year-old effort to identify all cell types in the human body. The Human Cell Atlas Project is an international collaborative effort that aims to define all human cell types in terms of distinctive molecular profiles (such as gene expression profiles) and to connect this information with classical cellular descriptions (such as location and morphology). An open comprehensive reference map of the molecular state of cells in healthy human tissues would propel the systematic study of physiological states, developmental trajectories, regulatory circuitry and interactions of cells, and also provide a framework for understanding cellular dysregulation in human disease. Here we describe the idea, its potential utility, early proofs-of-concept, and some design considerations for the Human Cell Atlas, including a commitment to open data, code, and community.
View details for PubMedID 29206104
Progress on the HUPO Draft Human Proteome: 2017 Metrics of the Human Proteome Project.
Journal of proteome research
2017; 16 (12): 4281-4287
The Human Proteome Organization (HUPO) Human Proteome Project (HPP) continues to make progress on its two overall goals: (1) completing the protein parts list, with an annual update of the HUPO draft human proteome, and (2) making proteomics an integrated complement to genomics and transcriptomics throughout biomedical and life sciences research. neXtProt version 2017-01-23 has 17 008 confident protein identifications (Protein Existence [PE] level 1) that are compliant with the HPP Guidelines v2.1 ( https://hupo.org/Guidelines ), up from 13 664 in 2012-12 and 16 518 in 2016-04. Remaining to be found by mass spectrometry and other methods are 2579 "missing proteins" (PE2+3+4), down from 2949 in 2016. PeptideAtlas 2017-01 has 15 173 canonical proteins, accounting for nearly all of the 15 290 PE1 proteins based on MS data. These resources have extensive data on PTMs, single amino acid variants, and splice isoforms. The Human Protein Atlas v16 has 10 492 highly curated protein entries with tissue and subcellular spatial localization of proteins and transcript expression. Organ-specific popular protein lists have been generated for broad use in quantitative targeted proteomics using SRM-MS or DIA-SWATH-MS studies of biology and disease.
View details for DOI 10.1021/acs.jproteome.7b00375
View details for PubMedID 28853897
View details for PubMedCentralID PMC5872831
A comprehensive structural, biochemical and biological profiling of the human NUDIX hydrolase family
2017; 8: 1541
The NUDIX enzymes are involved in cellular metabolism and homeostasis, as well as mRNA processing. Although highly conserved throughout all organisms, their biological roles and biochemical redundancies remain largely unclear. To address this, we globally resolve their individual properties and inter-relationships. We purify 18 of the human NUDIX proteins and screen 52 substrates, providing a substrate redundancy map. Using crystal structures, we generate sequence alignment analyses revealing four major structural classes. To a certain extent, their substrate preference redundancies correlate with structural classes, thus linking structure and activity relationships. To elucidate interdependence among the NUDIX hydrolases, we pairwise deplete them generating an epistatic interaction map, evaluate cell cycle perturbations upon knockdown in normal and cancer cells, and analyse their protein and mRNA expression in normal and cancer tissues. Using a novel FUSION algorithm, we integrate all data creating a comprehensive NUDIX enzyme profile map, which will prove fundamental to understanding their biological functionality.
View details for PubMedID 29142246
Proteomic analysis of cell cycle progression in asynchronous cultures, including mitotic subphases, using PRIMMUS.
The temporal regulation of protein abundance and post-translational modifications is a key feature of cell division. Recently, we analysed gene expression and protein abundance changes during interphase under minimally perturbed conditions (Ly et al., 2014, 2015). Here, we show that by using specific intracellular immunolabelling protocols, FACS separation of interphase and mitotic cells, including mitotic subphases, can be combined with proteomic analysis by mass spectrometry. Using this PRIMMUS (PRoteomic analysis of Intracellular iMMUnolabelled cell Subsets) approach, we now compare protein abundance and phosphorylation changes in interphase and mitotic fractions from asynchronously growing human cells. We identify a set of 115 phosphorylation sites increased during G2, termed 'early risers'. This set includes phosphorylation of S738 on TPX2, which we show is important for TPX2 function and mitotic progression. Further, we use PRIMMUS to provide the first proteome-wide analysis of protein abundance remodeling between prophase, prometaphase and anaphase.
View details for DOI 10.7554/eLife.27574
View details for PubMedID 29052541
View details for PubMedCentralID PMC5650473
RhoA knockout fibroblasts lose tumor-inhibitory capacity in vitro and promote tumor growth in vivo.
Proceedings of the National Academy of Sciences of the United States of America
2017; 114 (8): E1413-E1421
Fibroblasts are a main player in the tumor-inhibitory microenvironment. Upon tumor initiation and progression, fibroblasts can lose their tumor-inhibitory capacity and promote tumor growth. The molecular mechanisms that underlie this switch have not been defined completely. Previously, we identified four proteins overexpressed in cancer-associated fibroblasts and linked to Rho GTPase signaling. Here, we show that knocking out the Ras homolog family member A (RhoA) gene in normal fibroblasts decreased their tumor-inhibitory capacity, as judged by neighbor suppression in vitro and accompanied by promotion of tumor growth in vivo. This also induced PC3 cancer cell motility and increased colony size in 2D cultures. RhoA knockout in fibroblasts induced vimentin intermediate filament reorganization, accompanied by reduced contractile force and increased stiffness of cells. There was also loss of wide F-actin stress fibers and large focal adhesions. In addition, we observed a significant loss of α-smooth muscle actin, which indicates a difference between RhoA knockout fibroblasts and classic cancer-associated fibroblasts. In 3D collagen matrix, RhoA knockout reduced fibroblast branching and meshwork formation and resulted in more compactly clustered tumor-cell colonies in coculture with PC3 cells, which might boost tumor stem-like properties. Coculturing RhoA knockout fibroblasts and PC3 cells induced expression of proinflammatory genes in both. Inflammatory mediators may induce tumor cell stemness. Network enrichment analysis of transcriptomic changes, however, revealed that the Rho signaling pathway per se was significantly triggered only after coculturing with tumor cells. Taken together, our findings in vivo and in vitro indicate that Rho signaling governs the inhibitory effects by fibroblasts on tumor-cell growth.
View details for DOI 10.1073/pnas.1621161114
View details for PubMedID 28174275
View details for PubMedCentralID PMC5338371
Antibody Validation in Bioimaging Applications Based on Endogenous Expression of Tagged Proteins.
Journal of proteome research
2017; 16 (1): 147-155
Antibodies are indispensable research tools, yet the scientific community has not adopted standardized procedures to validate their specificity. Here we present a strategy to systematically validate antibodies for immunofluorescence (IF) applications using gene tagging. We have assessed the on- and off-target binding capabilities of 197 antibodies using 108 cell lines expressing EGFP-tagged target proteins at endogenous levels. Furthermore, we assessed batch-to-batch effects for 35 target proteins, showing that both the on- and off-target binding patterns vary significantly between antibody batches and that the proposed strategy serves as a reliable procedure for ensuring reproducibility upon production of new antibody batches. In summary, we present a systematic scheme for antibody validation in IF applications using endogenous expression of tagged proteins. This is an important step toward a reproducible approach for context- and application-specific antibody validation and improved reliability of antibody-based experiments and research data.
View details for DOI 10.1021/acs.jproteome.6b00821
View details for PubMedID 27723985
The endosomal transcriptional regulator RNF11 integrates degradation and transport of EGFR.
The Journal of cell biology
2016; 215 (4): 543-558
Stimulation of cells with epidermal growth factor (EGF) induces internalization and partial degradation of the EGF receptor (EGFR) by the endo-lysosomal pathway. For continuous cell functioning, EGFR plasma membrane levels are maintained by transporting newly synthesized EGFRs to the cell surface. The regulation of this process is largely unknown. In this study, we find that EGF stimulation specifically increases the transport efficiency of newly synthesized EGFRs from the endoplasmic reticulum to the plasma membrane. This coincides with an up-regulation of the inner coat protein complex II (COPII) components SEC23B, SEC24B, and SEC24D, which we show to be specifically required for EGFR transport. Up-regulation of these COPII components requires the transcriptional regulator RNF11, which localizes to early endosomes and appears additionally in the cell nucleus upon continuous EGF stimulation. Collectively, our work identifies a new regulatory mechanism that integrates the degradation and transport of EGFR in order to maintain its physiological levels at the plasma membrane.
View details for DOI 10.1083/jcb.201601090
View details for PubMedID 27872256
View details for PubMedCentralID PMC5119934
Metrics for the Human Proteome Project 2016: Progress on Identifying and Characterizing the Human Proteome, Including Post-Translational Modifications.
Journal of proteome research
2016; 15 (11): 3951-3960
The HUPO Human Proteome Project (HPP) has two overall goals: (1) stepwise completion of the protein parts list, the draft human proteome, including confidently identifying and characterizing at least one protein product from each protein-coding gene, with increasing emphasis on sequence variants, post-translational modifications (PTMs), and splice isoforms of those proteins; and (2) making proteomics an integrated counterpart to genomics throughout the biomedical and life sciences community. PeptideAtlas and GPMDB reanalyze all major human mass spectrometry data sets available through ProteomeXchange with standardized protocols and stringent quality filters; neXtProt curates and integrates mass spectrometry and other findings to present the most up-to-date authoritative compendium of the human proteome. The HPP Guidelines for Mass Spectrometry Data Interpretation version 2.1 were applied to manuscripts submitted for this 2016 C-HPP-led special issue [ www.thehpp.org/guidelines ]. The Human Proteome presented as neXtProt version 2016-02 has 16,518 confident protein identifications (Protein Existence [PE] Level 1), up from 13,664 at 2012-12, 15,646 at 2013-09, and 16,491 at 2014-10. There are 485 proteins that would have been PE1 under the Guidelines v1.0 from 2012 but now have insufficient evidence due to the agreed-upon more stringent Guidelines v2.0 to reduce false positives. neXtProt and PeptideAtlas now both require two non-nested, uniquely mapping (proteotypic) peptides of at least 9 aa in length. There are 2,949 missing proteins (PE2+3+4) as the baseline for submissions for this fourth annual C-HPP special issue of Journal of Proteome Research. PeptideAtlas has 14,629 canonical (plus 1187 uncertain and 1755 redundant) entries. GPMDB has 16,190 EC4 entries, and the Human Protein Atlas has 10,475 entries with supportive evidence. neXtProt, PeptideAtlas, and GPMDB are rich resources of information about post-translational modifications (PTMs), single amino acid variants (SAAVs), and splice isoforms. Meanwhile, the Biology- and Disease-driven (B/D)-HPP has created comprehensive SRM resources, generated popular protein lists to guide targeted proteomics assays for specific diseases, and launched an Early Career Researchers initiative.
View details for DOI 10.1021/acs.jproteome.6b00511
View details for PubMedID 27487407
View details for PubMedCentralID PMC5129622
Gene-specific correlation of RNA and protein levels in human cells and tissues.
Molecular systems biology
2016; 12 (10): 883
An important issue for molecular biology is to establish whether transcript levels of a given gene can be used as proxies for the corresponding protein levels. Here, we have developed a targeted proteomics approach for a set of human non-secreted proteins based on parallel reaction monitoring to measure, at steady-state conditions, absolute protein copy numbers across human tissues and cell lines and compared these levels with the corresponding mRNA levels using transcriptomics. The study shows that the transcript and protein levels do not correlate well unless a gene-specific RNA-to-protein (RTP) conversion factor independent of the tissue type is introduced, thus significantly enhancing the predictability of protein copy numbers from RNA levels. The results show that the RTP ratio varies significantly with a few hundred copies per mRNA molecule for some genes to several hundred thousands of protein copies per mRNA molecule for others. In conclusion, our data suggest that transcriptome analysis can be used as a tool to predict the protein copy numbers per cell, thus forming an attractive link between the field of genomics and proteomics.
View details for DOI 10.15252/msb.20167144
View details for PubMedID 27951527
View details for PubMedCentralID PMC5081484
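As a rough illustration of the gene-specific RNA-to-protein (RTP) conversion factor described in the abstract above, the following minimal Python sketch estimates an RTP ratio for one gene from paired measurements and uses it to predict protein copies in a new sample. The function names, the choice of the median of per-sample ratios as the estimator, and all numbers are illustrative assumptions, not the authors' method or data.

```python
from statistics import median


def estimate_rtp(protein_copies, mrna_levels):
    """Estimate a gene-specific RNA-to-protein (RTP) ratio.

    Assumption: the ratio is taken as the median of per-sample
    protein/mRNA ratios; the study may use a different estimator.
    """
    if len(protein_copies) != len(mrna_levels):
        raise ValueError("need one mRNA level per protein measurement")
    ratios = [p / m for p, m in zip(protein_copies, mrna_levels) if m > 0]
    if not ratios:
        raise ValueError("no sample with a positive mRNA level")
    return median(ratios)


def predict_protein_copies(rtp, mrna_level):
    """Predict protein copies per cell from an mRNA level and an RTP ratio."""
    return rtp * mrna_level


# Hypothetical measurements for one gene across three tissues
# (protein copies per cell, mRNA molecules per cell).
protein = [20_000.0, 41_000.0, 9_900.0]
mrna = [10.0, 20.0, 5.0]

rtp = estimate_rtp(protein, mrna)             # per-sample ratios: 2000, 2050, 1980
predicted = predict_protein_copies(rtp, 8.0)  # prediction for a new sample
```

Because the factor is assumed to be independent of tissue type, a single ratio per gene is reused for every sample. That reuse is what makes the prediction possible from RNA data alone.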
- Voices of biotech. Nature biotechnology 2016; 34 (3): 270-275
Introducing the Affinity Binder Knockdown Initiative: A public-private partnership for validation of affinity reagents.
EuPA open proteomics
2016; 10: 56-58
The newly launched Affinity Binder Knockdown Initiative encourages antibody suppliers and users to join this public-private partnership, which uses crowdsourcing to collect characterization data on antibodies. Researchers are asked to share validation data from experiments where gene-editing techniques (such as siRNA or CRISPR) have been used to verify antibody binding. The initiative is launched under the aegis of Antibodypedia, a database designed to allow comparisons and scoring of publicly available antibodies towards human protein targets. What is known about an antibody is the foundation of the scoring and ranking system in Antibodypedia.
View details for DOI 10.1016/j.euprot.2016.01.002
View details for PubMedID 29900101
View details for PubMedCentralID PMC5988587
Towards a functional definition of the mitochondrial human proteome.
EuPA open proteomics
2016; 10: 24-27
The mitochondrial human proteome project (mt-HPP) was initiated by the Italian HPP group as a part of both the chromosome-centric initiative (C-HPP) and the biology and disease driven initiative (B/D-HPP). In recent years several reports highlighted how mitochondrial biology and disease are regulated by specific interactions with non-mitochondrial proteins. Thus, it is of great relevance to extend our present view of the mitochondrial proteome not only to those proteins that are encoded by or transported to mitochondria, but also to their interactors that take part in mitochondria functionality. Here, we propose a graphical representation of the functional mitochondrial proteome by retrieving mitochondrial proteins from the NeXtProt database and adding to the network their interactors as annotated in the IntAct database. Notably, the network may represent a reference to map all the proteins that are currently being identified in mitochondrial proteomics studies.
View details for DOI 10.1016/j.euprot.2016.01.004
View details for PubMedID 29900096
View details for PubMedCentralID PMC5988588
The folate-coupled enzyme MTHFD2 is a nuclear protein and promotes cell proliferation.
2015; 5: 15029
Folate metabolism is central to cell proliferation and a target of commonly used cancer chemotherapeutics. In particular, the mitochondrial folate-coupled metabolism is thought to be important for proliferating cancer cells. The enzyme MTHFD2 in this pathway is highly expressed in human tumors and broadly required for survival of cancer cells. Although the enzymatic activity of the MTHFD2 protein is well understood, little is known about its larger role in cancer cell biology. We here report that MTHFD2 is co-expressed with two distinct gene sets, representing amino acid metabolism and cell proliferation, respectively. Consistent with a role for MTHFD2 in cell proliferation, MTHFD2 expression was repressed in cells rendered quiescent by deprivation of growth signals (serum) and rapidly re-induced by serum stimulation. Overexpression of MTHFD2 alone was sufficient to promote cell proliferation independent of its dehydrogenase activity, even during growth restriction. In addition to its known mitochondrial localization, we found MTHFD2 to have a nuclear localization and co-localize with DNA replication sites. These findings suggest a previously unknown role for MTHFD2 in cancer cell proliferation, adding to its known function in mitochondrial folate metabolism.
View details for DOI 10.1038/srep15029
View details for PubMedID 26461067
View details for PubMedCentralID PMC4602236
Metrics for the Human Proteome Project 2015: Progress on the Human Proteome and Guidelines for High-Confidence Protein Identification.
Journal of proteome research
2015; 14 (9): 3452-60
Remarkable progress continues on the annotation of the proteins identified in the Human Proteome and on finding credible proteomic evidence for the expression of "missing proteins". Missing proteins are those with no previous protein-level evidence or insufficient evidence to make a confident identification upon reanalysis in PeptideAtlas and curation in neXtProt. Enhanced with several major new data sets published in 2014, the human proteome presented as neXtProt, version 2014-09-19, has 16,491 unique confident proteins (PE level 1), up from 13,664 at 2012-12 and 15,646 at 2013-09. That leaves 2948 missing proteins from genes classified having protein existence level PE 2, 3, or 4, as well as 616 dubious proteins at PE 5. Here, we document the progress of the HPP and discuss the importance of assessing the quality of evidence, confirming automated findings and considering alternative protein matches for spectra and peptides. We provide guidelines for proteomics investigators to apply in reporting newly identified proteins.
View details for DOI 10.1021/acs.jproteome.5b00499
View details for PubMedID 26155816
View details for PubMedCentralID PMC4755311
Quest for Missing Proteins: Update 2015 on Chromosome-Centric Human Proteome Project.
Journal of proteome research
2015; 14 (9): 3415-31
This paper summarizes the recent activities of the Chromosome-Centric Human Proteome Project (C-HPP) consortium, which develops new technologies to identify yet-to-be annotated proteins (termed "missing proteins") in biological samples that lack sufficient experimental evidence at the protein level for confident protein identification. The C-HPP also aims to identify new protein forms that may be caused by genetic variability, post-translational modifications, and alternative splicing. Proteogenomic data integration forms the basis of the C-HPP's activities; therefore, we have summarized some of the key approaches and their roles in the project. We present new analytical technologies that improve the chemical space and lower detection limits coupled to bioinformatics tools and some publicly available resources that can be used to improve data analysis or support the development of analytical assays. Most of this paper's content has been compiled from posters, slides, and discussions presented in the series of C-HPP workshops held during 2014. All data (posters, presentations) used are available at the C-HPP Wiki (http://c-hpp.webhosting.rug.nl/) and in the Supporting Information.
View details for DOI 10.1021/pr5013009
View details for PubMedID 26076068
Analysis of the Human Prostate-Specific Proteome Defined by Transcriptomics and Antibody-Based Profiling Identifies TMEM79 and ACOXL as Two Putative, Diagnostic Markers in Prostate Cancer.
2015; 10 (8): e0133449
To better understand prostate function and disease, it is important to define and explore the molecular constituents that signify the prostate gland. The aim of this study was to define the prostate specific transcriptome and proteome, in comparison to 26 other human tissues. Deep sequencing of mRNA (RNA-seq) and immunohistochemistry-based protein profiling were combined to identify prostate specific gene expression patterns and to explore tissue biomarkers for potential clinical use in prostate cancer diagnostics. We identified 203 genes with elevated expression in the prostate, 22 of which showed more than five-fold higher expression levels compared to all other tissue types. In addition to previously well-known proteins we identified two poorly characterized proteins, TMEM79 and ACOXL, with potential to differentiate between benign and cancerous prostatic glands in tissue biopsies. In conclusion, we have applied a genome-wide analysis to identify the prostate specific proteome using transcriptomics and antibody-based protein profiling to identify genes with elevated expression in the prostate. Our data provides a starting point for further functional studies to explore the molecular repertoire of normal and diseased prostate including potential prostate cancer markers such as TMEM79 and ACOXL.
View details for DOI 10.1371/journal.pone.0133449
View details for PubMedID 26237329
View details for PubMedCentralID PMC4523174
The human liver-specific proteome defined by transcriptomics and antibody-based profiling.
FASEB journal : official publication of the Federation of American Societies for Experimental Biology
2014; 28 (7): 2901-14
Human liver physiology and the genetic etiology of the liver diseases can potentially be elucidated through the identification of proteins with enriched expression in the liver. Here, we combined data from RNA sequencing (RNA-Seq) and antibody-based immunohistochemistry across all major human tissues to explore the human liver proteome with enriched expression, as well as the cell type-enriched expression in hepatocyte and bile duct cells. We identified in total 477 protein-coding genes with elevated expression in the liver: 179 genes have higher expression as compared to all the other analyzed tissues; 164 genes have elevated transcript levels in the liver shared with at least one other tissue type; and an additional 134 genes have a mild level of increased expression in the liver. We identified the precise localization of these proteins through antibody-based protein profiling and the subcellular localization of these proteins through immunofluorescent-based profiling. We also identified the biological processes and metabolic functions associated with these proteins, investigated their contribution in the occurrence of liver diseases, and identified potential targets for their treatment. Our study demonstrates the use of RNA-Seq and antibody-based immunohistochemistry for characterizing the human liver proteome, as well as the use of tissue-specific proteins in identification of novel drug targets and discovery of biomarkers. Kampf, C., Mardinoglu, A., Fagerberg, L., Hallström, B. M., Edlund, K., Lundberg, E., Pontén, F., Nielsen, J., Uhlen, M. The human liver-specific proteome defined by transcriptomics and antibody-based profiling.
View details for DOI 10.1096/fj.14-250555
View details for PubMedID 24648543
Immunoproteomics using polyclonal antibodies and stable isotope-labeled affinity-purified recombinant proteins.
Molecular & cellular proteomics : MCP
2014; 13 (6): 1611-24
The combination of immuno-based methods and mass spectrometry detection has great potential in the field of quantitative proteomics. Here, we describe a new method (immuno-SILAC) for the absolute quantification of proteins in complex samples based on polyclonal antibodies and stable isotope-labeled recombinant protein fragments to allow affinity enrichment prior to mass spectrometry analysis and accurate quantification. We took advantage of the antibody resources publicly available from the Human Protein Atlas project covering more than 80% of all human protein-coding genes. Epitope mapping revealed that a majority of the polyclonal antibodies recognized multiple linear epitopes, and based on these results, a semi-automated method was developed for peptide enrichment using polyclonal antibodies immobilized on protein A-coated magnetic beads. A protocol based on the simultaneous multiplex capture of more than 40 protein targets showed that approximately half of the antibodies enriched at least one functional peptide detected in the subsequent mass spectrometry analysis. The approach was further developed to also generate quantitative data via the addition of heavy isotope-labeled recombinant protein fragment standards prior to trypsin digestion. Here, we show that we were able to use small amounts of antibodies (50 ng per target) in this manner for efficient multiplex analysis of quantitative levels of proteins in a human HeLa cell lysate. The results suggest that polyclonal antibodies generated via immunization of recombinant protein fragments could be used for the enrichment of target peptides to allow for rapid mass spectrometry analysis taking advantage of a substantial reduction in sample complexity. The possibility of building up a proteome-wide resource for immuno-SILAC assays based on publicly available antibody resources is discussed.
View details for DOI 10.1074/mcp.M113.034140
View details for PubMedID 24722731
View details for PubMedCentralID PMC4047479
RNA- and antibody-based profiling of the human proteome with focus on chromosome 19.
Journal of proteome research
2014; 13 (4): 2019-27
An important part of the Human Proteome Project is to characterize the protein complement of the genome with antibody-based profiling. Within the framework of this effort, a new version 12 of the Human Protein Atlas ( www.proteinatlas.org ) has been launched, including transcriptomics data for 27 tissues and 44 cell lines to complement the protein expression data from antibody-based profiling. Besides the extensive addition of transcriptomics data, the Human Protein Atlas now contains antibody-based protein profiles for 82% of the 20 329 putative protein-coding genes. The comprehensive data resulting from RNA-seq analysis and antibody-based profiling performed within the Human Protein Atlas as well as information from UniProt were used to generate evidence summary scores for each of the 20 329 genes, of which 94% now have experimental evidence at least at transcript level. The evidence scores for all individual genes are displayed with regards to both RNA- and antibody-based protein profiles, including chromosome-centric visualizations. An analysis of the human chromosome 19 shows that ∼43% of the genes are expressed at the transcript level in all 27 tissues analyzed, suggesting a "house-keeping" function, while 12% of the genes show a more tissue-specific pattern with enriched expression in one of the analyzed tissues only.
View details for DOI 10.1021/pr401156g
View details for PubMedID 24579871
A chromosome-centric analysis of antibodies directed toward the human proteome using Antibodypedia.
Journal of proteome research
2014; 13 (3): 1669-76
Antibodies are crucial for the study of human proteins and have been defined as one of the three pillars in the human chromosome-centric Human Proteome Project (C-HPP). In this article the chromosome-centric structure has been used to analyze the availability of antibodies as judged by the presence within the portal Antibodypedia, a database designed to allow comparisons and scoring of publicly available antibodies toward human protein targets. This public database displays antibody data from more than one million antibodies toward human protein targets. A summary of the content in this knowledge resource reveals that there exist more than 10 antibodies to over 70% of all the putative human genes, evenly distributed over the 24 human chromosomes. The analysis also shows that at present, less than 10% of the putative human protein-coding genes (n = 1882) predicted from the genome sequence lack antibodies, suggesting that focused efforts from the antibody-based and mass spectrometry-based proteomic communities should be encouraged to pursue the analysis of these missing proteins. We show that Antibodypedia may be used to track the development of available and validated antibodies to the individual chromosomes, and thus the database is an attractive tool to identify proteins with no or few antibodies yet generated.
View details for DOI 10.1021/pr4011525
View details for PubMedID 24533432
- Molecular- and Organelle-Based Predictive Paradigm Underlying Recovery by Left Ventricular Assist Device Support CIRCULATION-HEART FAILURE 2014; 7 (2): 359-366
Antibody performance in western blot applications is context-dependent.
2014; 9 (3): 435-45
An important concern for the use of antibodies in various applications, such as western blot (WB) or immunohistochemistry (IHC), is specificity. This calls for systematic validations using well-designed conditions. Here, we have analyzed 13 000 antibodies using western blot with lysates from human cell lines, tissues, and plasma. Standardized stratification showed that 45% of the antibodies yielded supportive staining, and the rest either no staining (12%) or protein bands of wrong size (43%). A comparative study of WB and IHC showed that the performance of antibodies is application-specific, although a correlation between no WB staining and weak IHC staining could be seen. To investigate the influence of protein abundance on the apparent specificity of the antibody, new WB analyses were performed for 1369 genes that gave unsupportive WBs in the initial screening using cell lysates with overexpressed full-length proteins. Then, more than 82% of the antibodies yielded a specific band corresponding to the full-length protein. Hence, the vast majority of the antibodies (90%) used in this study specifically recognize the target protein when present at sufficiently high levels. This demonstrates the context- and application-dependence of antibody validation and emphasizes that caution is needed when annotating binding reagents as specific or cross-reactive. WB is one of the most commonly used methods for validation of antibodies. Our data imply that solely using one platform for antibody validation might give misleading information and therefore at least one additional method should be used to verify the data obtained.
View details for DOI 10.1002/biot.201300341
View details for PubMedID 24403002
Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics.
Molecular & cellular proteomics : MCP
2014; 13 (2): 397-406
Global classification of the human proteins with regards to spatial expression patterns across organs and tissues is important for studies of human biology and disease. Here, we used a quantitative transcriptomics analysis (RNA-Seq) to classify the tissue-specific expression of genes across a representative set of all major human organs and tissues and combined this analysis with antibody-based profiling of the same tissues. To present the data, we launch a new version of the Human Protein Atlas that integrates RNA and protein expression data corresponding to ∼80% of the human protein-coding genes with access to the primary data for both the RNA and the protein analysis on an individual gene level. We present a classification of all human protein-coding genes with regards to tissue-specificity and spatial expression pattern. The integrative human expression map can be used as a starting point to explore the molecular constituents of the human body.
View details for DOI 10.1074/mcp.M113.035600
View details for PubMedID 24309898
View details for PubMedCentralID PMC3916642
Metrics for the Human Proteome Project 2013-2014 and strategies for finding missing proteins.
Journal of proteome research
2014; 13 (1): 15-20
One year ago the Human Proteome Project (HPP) leadership designated the baseline metrics for the Human Proteome Project to be based on neXtProt with a total of 13,664 proteins validated at protein evidence level 1 (PE1) by mass spectrometry, antibody-capture, Edman sequencing, or 3D structures. Corresponding chromosome-specific data were provided from PeptideAtlas, GPMdb, and Human Protein Atlas. This year, the neXtProt total is 15,646 and the other resources, which are inputs to neXtProt, have high-quality identifications and additional annotations for 14,012 in PeptideAtlas, 14,869 in GPMdb, and 10,976 in HPA. We propose to remove 638 genes from the denominator that are "uncertain" or "dubious" in Ensembl, UniProt/SwissProt, and neXtProt. That leaves 3844 "missing proteins", currently having no or inadequate documentation, to be found from a new denominator of 19,490 protein-coding genes. We present those tabulations and web links and discuss current strategies to find the missing proteins.
View details for DOI 10.1021/pr401144x
View details for PubMedID 24364385
View details for PubMedCentralID PMC3928647
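The counts quoted in the abstract above are mutually consistent, which a short calculation makes explicit. The Python sketch below uses only figures stated in the abstract; the variable names are illustrative, and the previous denominator is a derived value that the abstract implies but does not state.

```python
# Figures quoted in the abstract.
pe1_proteins = 15_646          # neXtProt PE1 total for this year
protein_coding_genes = 19_490  # new denominator after removing dubious genes
removed_genes = 638            # "uncertain" or "dubious" genes removed

# Missing proteins are the protein-coding genes without PE1 evidence.
missing_proteins = protein_coding_genes - pe1_proteins

# Derived, not stated in the abstract: the denominator before removal.
previous_denominator = protein_coding_genes + removed_genes

# Fraction of the new denominator with confident protein evidence.
fraction_pe1 = pe1_proteins / protein_coding_genes

print(missing_proteins)       # 3844, matching the abstract
print(previous_denominator)   # 20128
print(f"{fraction_pe1:.1%}")
```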
Contribution of antibody-based protein profiling to the human Chromosome-centric Proteome Project (C-HPP).
Journal of proteome research
2013; 12 (6): 2439-48
A gene-centric Human Proteome Project has been proposed to characterize the human protein-coding genes in a chromosome-centered manner to understand human biology and disease. Here, we report on the protein evidence for all genes predicted from the genome sequence based on manual annotation from literature (UniProt), antibody-based profiling in cells, tissues and organs and analysis of the transcript profiles using next generation sequencing in human cell lines of different origins. We estimate that there is good evidence for protein existence for 69% (n = 13985) of the human protein-coding genes, while 23% have only evidence on the RNA level and 7% still lack experimental evidence. Analysis of the expression patterns shows few tissue-specific proteins and approximately half of the genes expressed in all the analyzed cells. The status for each gene with regards to protein evidence is visualized in a chromosome-centric manner as part of a new version of the Human Protein Atlas ( www.proteinatlas.org ).
View details for DOI 10.1021/pr300924j
View details for PubMedID 23276153
Initial quantitative proteomic map of 28 mouse tissues using the SILAC mouse.
Molecular & cellular proteomics : MCP
2013; 12 (6): 1709-22
Identifying the building blocks of mammalian tissues is a precondition for understanding their function. In particular, global and quantitative analysis of the proteome of mammalian tissues would point to tissue-specific mechanisms and place the function of each protein in a whole-organism perspective. We performed proteomic analyses of 28 mouse tissues using high-resolution mass spectrometry and used a mix of mouse tissues labeled via stable isotope labeling with amino acids in cell culture as a "spike-in" internal standard for accurate protein quantification across these tissues. We identified a total of 7,349 proteins and quantified 6,974 of them. Bioinformatic data analysis showed that physiologically related tissues clustered together and that highly expressed proteins represented the characteristic tissue functions. Tissue specialization was reflected prominently in the proteomic profiles and is apparent already in their hundred most abundant proteins. The proportion of strictly tissue-specific proteins appeared to be small. However, even proteins with household functions, such as those in ribosomes and spliceosomes, can have dramatic expression differences among tissues. We describe a computational framework with which to correlate proteome profiles with physiological functions of the tissue. Our data will be useful to the broad scientific community as an initial atlas of protein expression of a mammalian species.
View details for DOI 10.1074/mcp.M112.024919
View details for PubMedID 23436904
View details for PubMedCentralID PMC3675825
A texture based pattern recognition approach to distinguish melanoma from non-melanoma cells in histopathological tissue microarray sections.
2013; 8 (5): e62070
Immunohistochemistry is a routine practice in clinical cancer diagnostics and also an established technology for tissue-based research regarding biomarker discovery efforts. Tedious manual assessment of immunohistochemically stained tissue needs to be fully automated to take full advantage of the potential for high throughput analyses enabled by tissue microarrays and digital pathology. Such automated tools also need to be reproducible for different experimental conditions and biomarker targets. In this study we present a novel supervised melanoma specific pattern recognition approach that is fully automated and quantitative. Melanoma samples were immunostained for the melanocyte specific target, Melan-A. Images representing immunostained melanoma tissue were then digitally processed to segment regions of interest, highlighting Melan-A positive and negative areas. Color deconvolution was applied to each region of interest to separate the channel containing the immunohistochemistry signal from the hematoxylin counterstaining channel. A support vector machine melanoma classification model was learned from a discovery melanoma patient cohort (n = 264) and subsequently validated on an independent cohort of melanoma patient tissue sample images (n = 157). Here we propose a novel method that takes advantage of an immunohistochemical marker highlighting melanocytes to fully automate the learning of a general melanoma cell classification model. The presented method can be applied on any protein of interest and thus provides a tool for quantification of immunohistochemistry-based protein expression in melanoma.
View details for DOI 10.1371/journal.pone.0062070
View details for PubMedID 23690928
View details for PubMedCentralID PMC3656869
Majority of differentially expressed genes are down-regulated during malignant transformation in a four-stage model.
Proceedings of the National Academy of Sciences of the United States of America
2013; 110 (17): 6853-8
The transformation of normal cells to malignant, metastatic tumor cells is a multistep process caused by the sequential acquirement of genetic changes. To identify these changes, we compared the transcriptomes and levels and distribution of proteins in a four-stage cell model of isogenically matched normal, immortalized, transformed, and metastatic human cells, using deep transcriptome sequencing and immunofluorescence microscopy. The data show that ∼6% (n = 1,357) of the human protein-coding genes are differentially expressed across the stages in the model. Interestingly, the majority of these genes are down-regulated, linking malignant transformation to dedifferentiation. The up-regulated genes are mainly components that control cellular proliferation, whereas the down-regulated genes consist of proteins exposed on or secreted from the cell surface. As many of the identified gene products control basic cellular functions that are defective in cancers, the data provide candidates for follow-up studies to investigate their functional roles in tumor formation. When we further compared the expression levels of four of the identified proteins in clinical cancer cohorts, similar differences were observed between benign and cancer cells, as in the cell model. This shows that this comprehensive demonstration of the molecular changes underlying malignant transformation is a relevant model to study the process of tumor formation.
View details for DOI 10.1073/pnas.1216436110
View details for PubMedID 23569271
View details for PubMedCentralID PMC3637701
Immunofluorescence and fluorescent-protein tagging show high correlation for protein localization in mammalian cells.
2013; 10 (4): 315-23
Imaging techniques such as immunofluorescence (IF) and the expression of fluorescent protein (FP) fusions are widely used to investigate the subcellular distribution of proteins. Here we report a systematic analysis of >500 human proteins comparing the localizations obtained in live versus fixed cells using FPs and IF, respectively. We identify systematic discrepancies between IF and FPs as well as between FP tagging at the N and C termini. The analysis shows that for 80% of the proteins, IF and FPs yield the same subcellular distribution, and the locations of 250 previously unlocalized proteins were determined by the overlap between the two methods. Approximately 60% of proteins localize to multiple organelles for both methods, indicating a complex subcellular protein organization. These results show that both IF and FP tagging are reliable techniques and demonstrate the usefulness of an integrative approach for a complete investigation of the subcellular human proteome.
View details for DOI 10.1038/nmeth.2377
View details for PubMedID 23435261
Centrosome isolation and analysis by mass spectrometry-based proteomics.
Methods in enzymology
2013; 525: 371-93
Centrioles are microtubule-based scaffolds that are essential for the formation of centrosomes, cilia, and flagella with important functions throughout the cell cycle, in physiology and during development. The ability to purify centriole-containing organelles on a large scale, combined with advances in protein identification using mass spectrometry-based proteomics, have revealed multiple centriole-associated proteins that are conserved during evolution in eukaryotes. Despite these advances, the molecular basis for the plethora of processes coordinated by cilia and centrosomes is not fully understood. Considering the complexity and dynamics of centriole-related proteomes and the first-pass analyses reported so far, it is likely that further insight might come from more thorough proteome analyses under various cellular and physiological conditions. To this end, we here describe methods to isolate centrosomes from human cells and strategies to selectively identify and study the properties of the associated proteins using quantitative mass spectrometry-based proteomics.
View details for DOI 10.1016/B978-0-12-397944-5.00018-3
View details for PubMedID 23522479
RNA deep sequencing as a tool for selection of cell lines for systematic subcellular localization of all human proteins.
Journal of proteome research
2013; 12 (1): 299-307
One of the major challenges of a chromosome-centric proteome project is to explore in a systematic manner the potential proteins identified from the chromosomal genome sequence, but not yet characterized on a protein level. Here, we describe the use of RNA deep sequencing to screen human cell lines for RNA profiles and to use this information to select cell lines suitable for characterization of the corresponding gene product. In this manner, the subcellular localization of proteins can be analyzed systematically using antibody-based confocal microscopy. We demonstrate the usefulness of selecting cell lines with high expression levels of RNA transcripts to increase the likelihood of high quality immunofluorescence staining and subsequent successful subcellular localization of the corresponding protein. The results show a path to combine transcriptomics with affinity proteomics to characterize the proteins in a gene- or chromosome-centric manner.
View details for DOI 10.1021/pr3009308
View details for PubMedID 23227862
A Chromosome-centric Human Proteome Project (C-HPP) to Characterize the Sets of Proteins Encoded in Chromosome 17
JOURNAL OF PROTEOME RESEARCH
2013; 12 (1): 45-57
We report progress assembling the parts list for chromosome 17 and illustrate the various processes that we have developed to integrate available data from diverse genomic and proteomic knowledge bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas, Human Protein Atlas (HPA), and GeneCards. All sites share the common resource of Ensembl for the genome modeling information. We have defined the chromosome 17 parts list with the following information: 1169 protein-coding genes, the numbers of proteins confidently identified by various experimental approaches as documented in GPMDB, neXtProt, PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq and proteomic studies of epithelial derived tumor cell lines (disease proteome) and a normal proteome (peripheral mononuclear cells), reported evidence of post-translational modifications, and examples of alternative splice variants (ASVs). We have constructed a list of the 59 "missing" proteins as well as 201 proteins that have inconclusive mass spectrometric (MS) identifications. In this report we have defined a process to establish a baseline for the incorporation of new evidence on protein identification and characterization as well as related information from transcriptome analyses. This initial list of "missing" proteins will guide the selection of appropriate samples for discovery studies as well as antibody reagents. Also we have illustrated the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information. 
In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.
View details for DOI 10.1021/pr300985j
View details for Web of Science ID 000313156300007
View details for PubMedID 23259914
Automated analysis and reannotation of subcellular locations in confocal images from the Human Protein Atlas.
2012; 7 (11): e50514
The Human Protein Atlas contains immunofluorescence images showing subcellular locations for thousands of proteins. These are currently annotated by visual inspection. In this paper, we describe automated approaches to analyze the images and their use to improve annotation. We began by training classifiers to recognize the annotated patterns. By ranking proteins according to the confidence of the classifier, we generated a list of proteins that were strong candidates for reexamination. In parallel, we applied hierarchical clustering to group proteins and identified proteins whose annotations were inconsistent with the remainder of the proteins in their cluster. These proteins were reexamined by the original annotators, and a significant fraction had their annotations changed. The results demonstrate that automated approaches can provide an important complement to visual annotation.
View details for DOI 10.1371/journal.pone.0050514
View details for PubMedID 23226299
View details for PubMedCentralID PMC3511558
Estimating microtubule distributions from 2D immunofluorescence microscopy images reveals differences among human cultured cell lines.
2012; 7 (11): e50292
Microtubules are filamentous structures that are involved in several important cellular processes, including cell division, cellular structure and mechanics, and intracellular transportation. Little is known about potential differences in microtubule distributions within and across cell lines. Here we describe a method to estimate information pertaining to 3D microtubule distributions from 2D fluorescence images. Our method allows for quantitative comparisons of microtubule distribution parameters (number of microtubules, mean length) between different cell lines. Among eleven cell lines compared, some showed differences that could be accounted for by differences in the total amount of tubulin per cell while others showed statistically significant differences in the balance between number and length of microtubules. We also observed that some cell lines that visually appear different in their microtubule distributions are quite similar when the model parameters are considered. The method is expected to be generally useful for comparing microtubule distributions between cell lines and for a given cell line after various perturbations. The results are also expected to enable analysis of the differences in gene expression underlying the observed differences in microtubule distributions among cell types.
View details for DOI 10.1371/journal.pone.0050292
View details for PubMedID 23209697
View details for PubMedCentralID PMC3508979
Comprehensive analysis of the genome transcriptome and proteome landscapes of three tumor cell lines.
2012; 4 (11): 86
We here present a comparative genome, transcriptome and functional network analysis of three human cancer cell lines (A431, U251MG and U2OS), and investigate their relation to protein expression. Gene copy numbers significantly influenced corresponding transcript levels; their effect on protein levels was less pronounced. We focused on genes with altered mRNA and/or protein levels to identify those active in tumor maintenance. We provide comprehensive information for the three genomes and demonstrate the advantage of integrative analysis for identifying tumor-related genes amidst numerous background mutations by relating genomic variation to expression/protein abundance data and use gene networks to reveal implicated pathways.
View details for DOI 10.1186/gm387
View details for PubMedID 23158748
View details for PubMedCentralID PMC3580420
Comparison of total and cytoplasmic mRNA reveals global regulation by nuclear retention and miRNAs.
2012; 13: 574
The majority of published gene-expression studies have used RNA isolated from whole cells, overlooking the potential impact of including nuclear transcriptome in the analyses. In this study, mRNA fractions from the cytoplasm and from whole cells (total RNA) were prepared from three human cell lines and sequenced using massive parallel sequencing. For all three cell lines, of about 15000 detected genes approximately 400 to 1400 genes were detected in different amounts in the cytoplasmic and total RNA fractions. Transcripts detected at higher levels in the total RNA fraction had longer coding sequences and a higher number of miRNA target sites. Transcripts detected at higher levels in the cytoplasmic fraction were shorter or contained shorter untranslated regions. Nuclear retention of transcripts and mRNA degradation via the miRNA pathway might contribute to this differential detection of genes. The consequence of the differential detection was further investigated by comparison to proteomics data. Interestingly, the expression profiles of cytoplasmic and total RNA correlated equally well with protein abundance levels, indicating regulation at a higher level. We conclude that expression levels derived from the total RNA fraction can be regarded as an appropriate estimate of the amount of mRNAs present in a given cell population, independent of the coding sequence length or UTRs.
View details for DOI 10.1186/1471-2164-13-574
View details for PubMedID 23110385
View details for PubMedCentralID PMC3495644
Proteomic Analysis Reveals Drug Accessible Cell Surface N-Glycoproteins of Primary and Established Glioblastoma Cell Lines
JOURNAL OF PROTEOME RESEARCH
2012; 11 (10): 4885-4893
Glioblastoma is the most common primary brain tumor in adults with low average survival time after diagnosis. In order to improve glioblastoma treatment, new drug-accessible targets need to be identified. Cell surface glycoproteins are prime drug targets due to their accessibility at the surface of cancer cells. To overcome the limited availability of suitable antibodies for cell surface protein detection, we performed a comprehensive mass spectrometric investigation of the glioblastoma surfaceome. Our combined cell surface capturing analysis of primary ex vivo glioblastoma cell lines in combination with established glioblastoma cell lines revealed 633 N-glycoproteins, which vastly extends the known data of surfaceome drug targets at subcellular resolution. We provide direct evidence of common glioblastoma cell surface glycoproteins and an approximate estimate of their abundances, information that could not be derived from genomic and/or transcriptomic glioblastoma studies. Apart from our pharmaceutically valuable repertoire of already and potentially drug-accessible cell surface glycoproteins, we built a mass-spectrometry-based toolbox enabling directed, sensitive, and repetitive glycoprotein measurements for clinical follow-up studies. The included Skyline Glioblastoma SRM assay library provides an elevated starting point for parallel testing of the abundance level of the detected glioblastoma surfaceome members in future drug perturbation experiments.
View details for DOI 10.1021/pr300360a
View details for Web of Science ID 000309441000011
View details for PubMedID 22909291
A tool to facilitate clinical biomarker studies--a tissue dictionary based on the Human Protein Atlas.
2012; 10: 103
The complexity of tissue and the alterations that distinguish normal from cancer remain a challenge for translating results from tumor biological studies into clinical medicine. This has generated an unmet need to exploit the findings from studies based on cell lines and model organisms to develop, validate and clinically apply novel diagnostic, prognostic and treatment predictive markers. As one step to meet this challenge, the Human Protein Atlas project has been set up to produce antibodies towards human protein targets corresponding to all human protein coding genes and to map protein expression in normal human tissues, cancer and cells. Here, we present a dictionary based on microscopy images created as an amendment to the Human Protein Atlas. The aim of the dictionary is to facilitate the interpretation and use of the image-based data available in the Human Protein Atlas, but also to serve as a tool for training and understanding tissue histology, pathology and cell biology. The dictionary contains three main parts, normal tissues, cancer tissues and cells, and is based on high-resolution images at different magnifications of full tissue sections stained with H & E. The cell atlas is centered on immunofluorescence and confocal microscopy images, using different color channels to highlight the organelle structure of a cell. Here, we explain how this dictionary can be used as a tool to aid clinicians and scientists in understanding the use of tissue histology and cancer pathology in diagnostics and biomarker studies.
View details for DOI 10.1186/1741-7015-10-103
View details for PubMedID 22971420
View details for PubMedCentralID PMC3523031
Systematic validation of antibody binding and protein subcellular localization using siRNA and confocal microscopy.
Journal of proteomics
2012; 75 (7): 2236-51
We have developed a platform for validation of antibody binding and protein subcellular localization data obtained from immunofluorescence using siRNA technology combined with automated confocal microscopy and image analysis. By combining the siRNA technology with automated sample preparation, automated imaging and quantitative image analysis, a high-throughput assay has been set up to enable confirmation of accurate protein binding and localization in a systematic manner. Here, we describe the analysis and validation of the subcellular location of 65 human proteins, targeted by 75 antibodies and silenced by 130 siRNAs. A large fraction (80%) of the subcellular locations, including locations of several previously uncharacterized proteins, could be confirmed by the significant down-regulation of the antibody signal after the siRNA silencing. A quantitative analysis was set up using automated image analysis to facilitate studies of targets found in more than one compartment. The results obtained using the platform demonstrate that siRNA silencing in combination with quantitative image analysis of antibody signals in different compartments of the cells is an attractive approach for ensuring accurate protein localization as well as antibody binding using immunofluorescence. With a large fraction of the human proteome still unexplored, we suggest this approach to be of great importance under the continued work of mapping the human proteome on a subcellular level.
View details for DOI 10.1016/j.jprot.2012.01.030
View details for PubMedID 22361696
Identification of autophagosome-associated proteins and regulators by quantitative proteomic analysis and genetic screens.
Molecular & cellular proteomics : MCP
2012; 11 (3): M111.014035
Autophagy is one of the major intracellular catabolic pathways, but little is known about the composition of autophagosomes. To study the associated proteins, we isolated autophagosomes from human breast cancer cells using two different biochemical methods and three stimulus types: amino acid deprivation or rapamycin or concanamycin A treatment. The autophagosome-associated proteins were dependent on stimulus, but a core set of proteins was stimulus-independent. Remarkably, proteasomal proteins were abundant among the stimulus-independent common autophagosome-associated proteins, and the activation of autophagy significantly decreased the cellular proteasome level and activity supporting interplay between the two degradation pathways. A screen of yeast strains defective in the orthologs of the human genes encoding for a common set of autophagosome-associated proteins revealed several regulators of autophagy, including subunits of the retromer complex. The combined spatiotemporal proteomic and genetic data sets presented here provide a basis for further characterization of autophagosome biogenesis and cargo selection.
View details for DOI 10.1074/mcp.M111.014035
View details for PubMedID 22311637
View details for PubMedCentralID PMC3316729
Antibody-based protein profiling of the human chromosome 21.
Molecular & cellular proteomics : MCP
2012; 11 (3): M111.013458
The Human Proteome Project has been proposed to create a knowledge-based resource based on a systematical mapping of all human proteins, chromosome by chromosome, in a gene-centric manner. With this background, we here describe the systematic analysis of chromosome 21 using an antibody-based approach for protein profiling using both confocal microscopy and immunohistochemistry, complemented with transcript profiling using next generation sequencing data. We also describe a new approach for protein isoform analysis using a combination of antibody-based probing and isoelectric focusing. The analysis has identified several genes on chromosome 21 with no previous evidence on the protein level, and the isoform analysis indicates that a large fraction of human proteins have multiple isoforms. A chromosome-wide matrix is presented with status for all chromosome 21 genes regarding subcellular localization, tissue distribution, and molecular characterization of the corresponding proteins. The path to generate a chromosome-specific resource, including integrated data from complementary assay platforms, such as mass spectrometry and gene tagging analysis, is discussed.
View details for DOI 10.1074/mcp.M111.013458
View details for PubMedID 22042635
View details for PubMedCentralID PMC3316724
Characterization of MRFAP1 turnover and interactions downstream of the NEDD8 pathway.
Molecular & cellular proteomics : MCP
2012; 11 (3): M111.014407
The NEDD8-Cullin E3 ligase pathway plays an important role in protein homeostasis, in particular the degradation of cell cycle regulators and transcriptional control networks. To characterize NEDD8-cullin target proteins, we performed a quantitative proteomic analysis of cells treated with MLN4924, a small molecule inhibitor of the NEDD8 conjugation pathway. MRFAP1 and its interaction partner, MORF4L1, were among the most up-regulated proteins after NEDD8 inhibition in multiple human cell lines. We show that MRFAP1 has a fast turnover rate in the absence of MLN4924 and is degraded via the ubiquitin-proteasome system. The increased abundance of MRFAP1 after MLN4924 treatment results from a decreased rate of degradation. Characterization of the binding partners of both MRFAP1 and MORF4L1 revealed a complex protein-protein interaction network. MRFAP1 bound to a number of E3 ubiquitin ligases, including CUL4B, but not to components of the NuA4 complex, including MRGBP, which bound to MORF4L1. These data indicate that MRFAP1 may regulate the ability of MORF4L1 to interact with chromatin-modifying enzymes by binding to MORF4L1 in a mutually exclusive manner with MRGBP. Analysis of MRFAP1 expression in human tissues by immunostaining with a MRFAP1-specific antibody revealed that it was detectable in only a small number of tissues, in particular testis and brain. Strikingly, analysis of the seminiferous tubules of the testis showed the highest nuclear staining in the spermatogonia and much weaker staining in the spermatocytes and spermatids. MRGBP was inversely correlated with MRFAP1 expression in these cell types, consistent with an exchange of MORF4L1 interaction partners as cells progress through meiosis in the testis. These data highlight an important new arm of the NEDD8-cullin pathway.
View details for DOI 10.1074/mcp.M111.014407
View details for PubMedID 22038470
View details for PubMedCentralID PMC3316733
Systematic analysis of protein pools, isoforms, and modifications affecting turnover and subcellular localization.
Molecular & cellular proteomics : MCP
2012; 11 (3): M111.013680
In higher eukaryotes many genes encode protein isoforms whose properties and biological roles are often poorly characterized. Here we describe systematic approaches for detection of either distinct isoforms, or separate pools of the same isoform, with differential biological properties. Using information from ion intensities we have estimated protein abundance levels and using rates of change in stable isotope labeling with amino acids in cell culture isotope ratios we measured turnover rates and subcellular distribution for the HeLa cell proteome. Protein isoforms were detected using three data analysis strategies that evaluate differences between stable isotope labeling with amino acids in cell culture isotope ratios for specific groups of peptides within the total set of peptides assigned to a protein. The candidate approach compares stable isotope labeling with amino acids in cell culture isotope ratios for predicted isoform-specific peptides, with ratio values for peptides shared by all the isoforms. The rule of thirds approach compares the mean isotope ratio values for all peptides in each of three equal segments along the linear length of the protein, assessing differences between segment values. The three in a row approach compares mean isotope ratio values for each sequential group of three adjacent peptides, assessing differences with the mean value for all peptides assigned to the protein. Protein isoforms were also detected and their properties evaluated by fractionating cell extracts on one-dimensional SDS-PAGE prior to trypsin digestion and MS analysis and independently evaluating isotope ratio values for the same peptides isolated from different gel slices. The effect of protein phosphorylation on turnover rates was analyzed by comparing mean turnover values calculated for all peptides assigned to a protein, either including, or excluding, values for cognate phosphopeptides. 
Collectively, these experimental and analytical approaches provide a framework for expanding the functional annotation of the genome.
View details for DOI 10.1074/mcp.M111.013680
View details for PubMedID 22002106
View details for PubMedCentralID PMC3316725
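The "rule of thirds" and "three in a row" comparisons described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the (start position, ratio) input layout, the use of sliding (overlapping) windows, and the example values are all assumptions made for illustration.

```python
from statistics import mean


def rule_of_thirds(peptides, protein_length):
    """Mean isotope ratio per third of the protein.

    peptides: iterable of (start_position, isotope_ratio), with 1-based
    start positions (a hypothetical input layout).
    Returns three means; a segment with no peptides yields None.
    """
    segments = [[], [], []]
    for start, ratio in peptides:
        # Map the start position onto segment 0, 1 or 2.
        index = min(2, (start - 1) * 3 // protein_length)
        segments[index].append(ratio)
    return [mean(s) if s else None for s in segments]


def three_in_a_row(ratios):
    """Deviation of each window of three adjacent peptides from the
    protein-wide mean. Sliding windows are an assumption here."""
    overall = mean(ratios)
    return [mean(ratios[i:i + 3]) - overall for i in range(len(ratios) - 2)]


# Made-up example values, ordered by position along a 300-residue protein.
peptides = [(5, 1.0), (40, 1.2), (120, 1.1), (210, 2.9), (260, 3.1)]
segment_means = rule_of_thirds(peptides, protein_length=300)
window_deviations = three_in_a_row([ratio for _, ratio in peptides])
```

In this toy input the last third has a clearly higher mean ratio than the first two, which is the kind of segment-level difference the abstract describes as a possible sign of distinct isoforms or protein pools.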
A Protein Epitope Signature Tag (PrEST) library allows SILAC-based absolute quantification and multiplexed determination of protein copy numbers in cell lines.
Molecular & cellular proteomics : MCP
2012; 11 (3): O111.009613
Mass spectrometry-based proteomics increasingly relies on relative or absolute quantification. In relative quantification, stable isotope based methods often allow mixing at early stages of sample preparation, whereas for absolute quantification this has generally required recombinant expression of full length, labeled protein standards. Here we make use of a very large library of Protein Epitope Signature Tags (PrESTs) that has been developed in the course of the Human Protein Atlas Project. These PrESTs are expressed recombinantly in E. coli and they consist of a short and unique region of the protein of interest as well as purification and solubility tags. We first quantify a highly purified, stable isotope labeling of amino acids in cell culture (SILAC)-labeled version of the solubility tag and use it to determine the precise amount of each PrEST by its SILAC ratios. The PrESTs are then spiked into cell lysates and the SILAC ratios of PrEST peptides to peptides from endogenous target proteins yield their cellular quantities. The procedure can readily be multiplexed, as we demonstrate by simultaneously determining the copy number of 40 proteins in HeLa cells. Among the proteins analyzed, the cytoskeletal protein vimentin was found to be most abundant with 20 million copies per cell, while the transcription factor and oncogene FOS only had 6000 copies. Direct quantification of the absolute amount of single proteins is possible via a SILAC experiment in which labeled cell lysate is mixed both with the heavy labeled solubility tag and with the corresponding PrEST. The SILAC-PrEST combination allows accurate and streamlined quantification of the absolute or relative amount of proteins of interest in a wide variety of applications.
View details for DOI 10.1074/mcp.O111.009613
View details for PubMedID 21964433
View details for PubMedCentralID PMC3316735
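The arithmetic behind turning a SILAC ratio against a spiked-in standard of known amount into a copy number per cell can be sketched as follows. This is a simplified illustration, not the published workflow: the function name, parameters, and example numbers are assumptions, and it presumes the ratio is expressed as endogenous protein over PrEST and that the number of lysed cells is known.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(ratio_endogenous_to_prest, prest_spiked_mol, n_cells):
    """Estimate protein copies per cell from a SILAC ratio.

    ratio_endogenous_to_prest: endogenous / PrEST peptide intensity ratio
    prest_spiked_mol: amount of PrEST standard added to the lysate, in mol
    n_cells: number of cells that contributed to the lysate
    (all three parameters are hypothetical names for this sketch)
    """
    endogenous_mol = ratio_endogenous_to_prest * prest_spiked_mol
    return endogenous_mol * AVOGADRO / n_cells


# Made-up example: ratio of 2.0, 1 pmol of PrEST spiked in, one million cells.
estimate = copies_per_cell(2.0, 1e-12, 1_000_000)
```

Multiplexing, as the abstract describes it, would amount to repeating this calculation for each target protein with its own PrEST amount and measured ratio.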
Generation of monospecific antibodies based on affinity capture of polyclonal antibodies.
Protein science : a publication of the Protein Society
2011; 20 (11): 1824-35
A method is described to generate and validate antibodies based on mapping the linear epitopes of a polyclonal antibody followed by sequential epitope-specific capture using synthetic peptides. Polyclonal antibodies directed towards four proteins RBM3, SATB2, ANLN, and CNDP1, potentially involved in human cancers, were selected and antibodies to several non-overlapping epitopes were generated and subsequently validated by Western blot, immunohistochemistry, and immunofluorescence. For all four proteins, a dramatic difference in functionality could be observed for these monospecific antibodies directed to the different epitopes. In each case, at least one antibody was obtained with full functionality across all applications, while other epitope-specific fractions showed no or little functionality. These results present a path forward to use the mapped binding sites of polyclonal antibodies to generate epitope-specific antibodies, providing an attractive approach for large-scale efforts to characterize the human proteome by antibodies.
View details for DOI 10.1002/pro.716
View details for PubMedID 21898641
View details for PubMedCentralID PMC3267947
Mapping the subcellular protein distribution in three human cell lines.
Journal of proteome research
2011; 10 (8): 3766-77
The subcellular locations of proteins are closely related to their function and constitute an essential aspect for understanding the complex machinery of living cells. A systematic effort has been initiated to map the protein distribution in three functionally different cell lines with the aim to provide a subcellular localization index for at least one representative protein from all human protein-encoding genes. Here, we present the results of more than 3500 proteins mapped to 16 subcellular compartments. The results indicate a ubiquitous protein expression with a majority of the proteins found in all three cell lines and a large portion localized to two or more compartments. The inter-relationships between the subcellular compartments are visualized in a protein-compartment network based on all detected proteins. Hierarchical clustering was performed to determine how closely related the organelles are in terms of protein constituents and compare the proteins detected in each cell type. Our results show distinct organelle proteomes, well conserved across the cell types, and demonstrate that biochemically similar organelles are grouped together.
View details for DOI 10.1021/pr200379a
View details for PubMedID 21675716
SATB2 in combination with cytokeratin 20 identifies over 95% of all colorectal carcinomas.
The American journal of surgical pathology
2011; 35 (7): 937-48
The special AT-rich sequence-binding protein 2 (SATB2), a nuclear matrix-associated transcription factor and epigenetic regulator, was identified as a tissue type-specific protein when screening protein expression patterns in human normal and cancer tissues using an antibody-based proteomics approach. In this respect, the SATB2 protein shows a selective pattern of expression and, within cells of epithelial lineages, SATB2 expression is restricted to glandular cells lining the lower gastrointestinal tract. The expression of SATB2 protein is primarily preserved in cancer cells of colorectal origin, indicating that SATB2 could function as a clinically useful diagnostic marker to distinguish colorectal cancer (CRC) from other types of cancer. The aim of this study was to further explore and validate the specific expression pattern of SATB2 as a clinical biomarker and to compare SATB2 with the well-known cytokeratin 20 (CK20). Immunohistochemistry was used to analyze the extent of SATB2 expression in tissue microarrays with tumors from 9 independent cohorts of patients with primary and metastatic CRCs (n=1882). Our results show that SATB2 is a sensitive and highly specific marker for CRC with distinct positivity in 85% of all CRCs, and that SATB2 and/or CK20 was positive in 97% of CRCs. In conclusion, the specific expression of SATB2 in a large majority of CRCs suggests that SATB2 can be used as an important complementary tool for the differential diagnosis of carcinoma of unknown primary origin.
View details for DOI 10.1097/PAS.0b013e31821c3dae
View details for PubMedID 21677534
Novel asymmetrically localizing components of human centrosomes identified by complementary proteomics methods.
The EMBO journal
2011; 30 (8): 1520-35
Centrosomes in animal cells are dynamic organelles with a proteinaceous matrix of pericentriolar material assembled around a pair of centrioles. They organize the microtubule cytoskeleton and the mitotic spindle apparatus. Mature centrioles are essential for biogenesis of primary cilia that mediate key signalling events. Despite recent advances, the molecular basis for the plethora of processes coordinated by centrosomes is not fully understood. We have combined protein identification and localization, using PCP-SILAC mass spectrometry, BAC transgeneOmics, and antibodies to define the constituents of human centrosomes. From a background of non-specific proteins, we distinguished 126 known and 40 candidate centrosomal proteins, of which 22 were confirmed as novel components. An antibody screen covering 4000 genes revealed an additional 113 candidates. We illustrate the power of our methods by identifying a novel set of five proteins preferentially associated with mother or daughter centrioles, comprising genes implicated in cell polarity. Pulsed labelling demonstrates a remarkable variation in the stability of centrosomal protein complexes. These spatiotemporal proteomics data provide leads to the further functional characterization of centrosomal proteins.
View details for DOI 10.1038/emboj.2011.63
View details for PubMedID 21399614
View details for PubMedCentralID PMC3102290
Selection and characterisation of affibody molecules inhibiting the interaction between Ras and Raf in vitro.
2010; 27 (6): 766-73
Development of molecules with the ability to selectively inhibit particular protein-protein interactions is important in providing tools for understanding cell biology. In this work, we describe efforts to select small Ras- and Raf-specific three-helix bundle affibody binding proteins capable of inhibiting the interaction between H-Ras and Raf-1, from a combinatorial library displayed on bacteriophage. Target-specific variants with typically high nanomolar or low micromolar affinities (K(D)) could be selected successfully against both proteins, as shown by dot blot, ELISA and real-time biospecific interaction analyses. Affibody molecule variants selected against H-Ras were shown to bind epitopes overlapping each other at a site that differed from that at which H-Ras interacts with Raf-1. In contrast, an affibody molecule isolated during selection against Raf-1 was shown to effectively inhibit the interaction between H-Ras and Raf-1 in a dose-dependent manner. Possible intracellular applications of the selected affibody molecules are discussed.
View details for DOI 10.1016/j.nbt.2010.07.016
View details for PubMedID 20674812
Defining the transcriptome and proteome in three functionally different human cell lines.
Molecular systems biology
2010; 6: 450
An essential question in human biology is how cells and tissues differ in gene and protein expression and how these differences delineate specific biological function. Here, we have performed a global analysis of both mRNA and protein levels based on sequence-based transcriptome analysis (RNA-seq), SILAC-based mass spectrometry analysis and antibody-based confocal microscopy. The study was performed in three functionally different human cell lines and based on the global analysis, we estimated the fractions of mRNA and protein that are cell specific or expressed at similar/different levels in the cell lines. A highly ubiquitous RNA expression was found with >60% of the gene products detected in all cells. The changes of mRNA and protein levels in the cell lines using SILAC and RNA ratios show high correlations, even though the genome-wide dynamic range is substantially higher for the proteins as compared with the transcripts. Large general differences in abundance for proteins from various functional classes are observed and, in general, the cell-type specific proteins are low abundant and highly enriched for cell-surface proteins. Thus, this study shows a path to characterize the transcriptome and proteome in human cells from different origins.
View details for DOI 10.1038/msb.2010.106
View details for PubMedID 21179022
View details for PubMedCentralID PMC3018165
Analysis of transcript and protein overlap in a human osteosarcoma cell line.
2010; 11: 684
An interesting field of research in genomics and proteomics is to compare the overlap between the transcriptome and the proteome. Recently, the tools to analyse gene and protein expression on a whole-genome scale have been improved, including the availability of the new generation sequencing instruments and high-throughput antibody-based methods to analyze the presence and localization of proteins. In this study, we used massive transcriptome sequencing (RNA-seq) to investigate the transcriptome of a human osteosarcoma cell line and compared the expression levels with protein data obtained in situ from antibody-based immunohistochemistry (IHC) and immunofluorescence microscopy (IF). A large-scale analysis based on 2749 genes was performed, corresponding to approximately 13% of the protein coding genes in the human genome. We found the presence of both RNA and proteins for a large fraction of the analyzed genes, with 60% of the analyzed human genes detected by all three methods. Only 34 genes (1.2%) were not detected on the transcriptional or protein level with any method. Our data suggest that the majority of the human genes are expressed at detectable transcript or protein levels in this cell line. Since the reliability of antibodies depends on possible cross-reactivity, we compared the RNA and protein data using antibodies with different reliability scores based on various criteria, including Western blot analysis. Gene products detected in all three platforms generally have good antibody validation scores, while those detected only by antibodies, but not by RNA sequencing, generally consist of more low-scoring antibodies. This suggests that some antibodies are staining the cells in an unspecific manner, and that assessment of transcript presence by RNA-seq can provide guidance for validation of the corresponding antibodies.
View details for DOI 10.1186/1471-2164-11-684
View details for PubMedID 21126332
View details for PubMedCentralID PMC3014981
- Towards a knowledge-based Human Protein Atlas. Nature biotechnology 2010; 28 (12): 1248-50
Creation of an antibody-based subcellular protein atlas.
2010; 10 (22): 3984-96
An important part of understanding the complex machinery of living cells is to know the spatial distribution of proteins all the way from organ to organelle levels. An equally important part of proteomics is to map the subcellular distribution of all human proteins. Here, we discuss methodologies for systematic subcellular profiling with emphasis on the antibody-based approach performed as a part of the Human Protein Atlas project. The considerations made when creating the subcellular protein atlas and critical parameters of this approach are discussed.
View details for DOI 10.1002/pmic.201000125
View details for PubMedID 20648481
Subcellular distribution and expression of prenylated Rab acceptor 1 domain family, member 2 (PRAF2) in malignant glioma: Influence on cell survival and migration.
2010; 101 (7): 1624-31
Our previous studies revealed that the expression of the 19-kDa protein prenylated Rab acceptor 1 domain family, member 2 (PRAF2) is elevated in cancer tissues of the breast, colon, lung, and ovary, when compared to noncancerous tissues of paired samples. PRAF2 mRNA expression also correlated with several genetic and clinical features and is a candidate prognostic marker in the pediatric cancer neuroblastoma. The PRAF2-related proteins, PRAF1 and PRAF3, play multiple roles in cellular processes, including endo/exocytic vesicle trafficking and glutamate uptake. PRAF2 shares a high sequence homology with these family members, but its function remains unknown. In this study, we examined PRAF2 mRNA and protein expression in 20 different human cancer types using Affymetrix microarray and human tissue microarray (TMA) analyses, respectively. In addition, we investigated the subcellular distribution of PRAF2 by immunofluorescence microscopy and cell fractionation studies. PRAF2 mRNA and protein expression was elevated in several cancer tissues with highest levels in malignant glioma. At the molecular level, we detected native PRAF2 in small, vesicle-like structures throughout the cytoplasm as well as in and around cell nuclei of U-87 malignant glioma cells. We further found that monomeric and dimeric forms of PRAF2 are associated with different cell compartments, suggesting possible functional differences. Importantly, PRAF2 down-regulation by RNA interference significantly reduced the cell viability, migration, and invasiveness of U-87 cells. This study shows that PRAF2 expression is elevated in various tumors with exceptionally high expression in malignant gliomas, and PRAF2 therefore presents a candidate molecular target for therapeutic intervention.
View details for DOI 10.1111/j.1349-7006.2010.01570.x
View details for PubMedID 20412121
A single fixation protocol for proteome-wide immunofluorescence localization studies.
Journal of proteomics
2010; 73 (6): 1067-78
Immunofluorescence microscopy is a valuable tool for analyzing protein expression and localization at a subcellular level, thus providing information regarding protein function, interaction partners and its role in cellular processes. When performing sample fixation, parameters such as differences in the accessibility of proteins present in various cellular compartments, as well as the chemical composition of the protein to be studied, need to be taken into account. However, in systematic and proteome-wide efforts, a need exists for standard fixation protocol(s) that work well for the majority of all proteins independent of subcellular localization. Here, we report on a study with the goal to find a standardized protocol based on the analysis of 18 human proteins localized in 11 different organelles and subcellular structures. Six fixation protocols were tested based on either dehydration by alcohols (methanol, ethanol or iso-propanol) or cross-linking by paraformaldehyde followed by detergent permeabilization (Triton X-100 or saponin) in three human cell lines. Our results show that cross-linking is essential for proteome-wide localization studies and that cross-linking using paraformaldehyde followed by Triton X-100 permeabilization can successfully be used as a single fixation protocol for systematic studies.
View details for DOI 10.1016/j.jprot.2009.10.012
View details for PubMedID 19896565
Selection of affibody molecules to the ligand-binding site of the insulin-like growth factor-1 receptor.
Biotechnology and applied biochemistry
2010; 55 (2): 99-109
Affibody molecules binding to the site of hormone interaction in IGF-1R (insulin-like growth factor-1 receptor) were successfully selected by phage-display technology employing a competitive-elution strategy during biopanning, whereby release of receptor-bound phagemids was accomplished by competition with IGF-1 (insulin-like growth factor-1). In non-competitive selections, the elution of receptor-bound phagemids was performed by imidazole or low-pH incubation, which also resulted in the isolation of affibody molecules that could bind to the receptor. An ELISA-based assay showed that the affibody molecules generated by IGF-1 competition during elution, in addition to affibody molecules generated in the non-competitive selections, could compete with IGF-1 for binding to the receptor. The affinities of the isolated variants to IGF-1R-overexpressing MCF-7 cells were determined and ranged from high nanomolar to 2.3 nM. The most promising variant, Z4:40, was shown to recognize IGF-1R efficiently in several different contexts: in analyses based on flow cytometry, fluorescence microscopy and receptor pull-down from cell extracts. In addition, when Z4:40 was added to the medium of MCF-7 cells that were dependent on IGF-1 for efficient growth, it was found to have a dose-dependent growth-inhibitory effect on the cells. Applications of affibody-based reagents for quantitative and qualitative analyses of IGF-1R status, as well as applications of affibody-based reagents for therapy, are discussed.
View details for DOI 10.1042/BA20090226
View details for PubMedID 20088825
A global view of protein expression in human cells, tissues, and organs.
Molecular systems biology
2009; 5: 337
Defining the protein profiles of tissues and organs is critical to understanding the unique characteristics of the various cell types in the human body. In this study, we report on an anatomically comprehensive analysis of 4842 protein profiles in 48 human tissues and 45 human cell lines. A detailed analysis of over 2 million manually annotated, high-resolution, immunohistochemistry-based images showed a high fraction (>65%) of expressed proteins in most cells and tissues, with very few proteins (<2%) detected in any single cell type. Similarly, confocal microscopy in three human cell lines detected expression of more than 70% of the analyzed proteins. Despite this ubiquitous expression, hierarchical clustering analysis, based on global protein expression patterns, shows that the analyzed cells can be still subdivided into groups according to the current concepts of histology and cellular differentiation. This study suggests that tissue specificity is achieved by precise regulation of protein levels in space and time, and that different tissues in the body acquire their unique characteristics by controlling not which proteins are expressed but how much of each is produced.
View details for DOI 10.1038/msb.2009.93
View details for PubMedID 20029370
View details for PubMedCentralID PMC2824494
Affibody-mediated retention of the epidermal growth factor receptor in the secretory compartments leads to inhibition of phosphorylation in the kinase domain.
2009; 25 (6): 417-23
Abnormal activity of the epidermal growth factor receptor (EGFR) is associated with various cancer-related processes and motivates the search for strategies that can selectively block EGFR signalling. In this study, functional knockdown of EGFR was achieved through expression of an affibody construct, (ZEGFR:1907)(2-)KDEL, with high affinity for EGFR and extended with the amino acids KDEL to make it resident in the secretory compartments. Expression of (ZEGFR:1907)(2-)KDEL resulted in 80% reduction of the cell surface level of EGFR, and fluorescent staining for EGFR and the (ZEGFR:1907)(2-)KDEL construct showed overlapping intracellular localisation. Immunocapture of EGFR from cell lysates showed that an intracellular complex between EGFR and the affibody construct had been formed, further indicating a specific interaction between the affibody construct and EGFR. Surface depletion of EGFR led to a dramatic decrease in the amount of kinase domain phosphorylated EGFR, coincident with a significant decrease in the proliferation rate.
View details for DOI 10.1016/j.nbt.2009.02.001
View details for PubMedID 19552886
Selective expression of Syntaxin-7 protein in benign melanocytes and malignant melanoma.
Journal of proteome research
2009; 8 (4): 1639-46
To search for proteins expressed in human melanocytes and melanoma, we employed an antibody-based proteomics strategy to screen for protein expression in tissue microarrays containing normal tissues, cancer tissues and cell lines. Syntaxin-7 (STX7) was identified as a novel protein, not previously characterized in cells of melanocytic lineage, displaying a cell type-specific protein expression pattern. In tumor tissues, STX7 was expressed in malignant melanoma and lymphoma. The protein was further characterized regarding subcellular localization, specificity, tissue distribution pattern and potential as a diagnostic and prognostic marker using cell lines and tissue microarrays containing normal skin, melanocytic nevi and primary and metastatic melanoma. STX7 was expressed in normal melanocytes, various benign melanocytic nevi, atypical nevi and malignant melanoma. Analysis in two independent melanoma cohorts demonstrated STX7 expression in nearly all investigated tumors, although at varying levels (> 90% positive tumors). The expression level of STX7 protein was inversely correlated to tumor stage, suggesting that decreased expression of STX7 is associated with more aggressive tumors. In conclusion, we present protein profiling data for a novel protein showing high sensitivity and specificity for cells of the melanocytic lineage. The presented antibody-based proteomics approach can be used as an effective strategy to identify novel tumor markers and evaluate their potential clinical relevance.
View details for DOI 10.1021/pr800745e
View details for PubMedID 19714869
Selection and characterization of Affibody ligands to the transcription factor c-Jun.
Biotechnology and applied biochemistry
2009; 52 (Pt 1): 17-27
c-Jun is a highly oncogenic transcription factor involved in the development of different types of cancer. In the present study we have generated c-Jun-binding-affinity proteins from a phage-displayed library of so-called 'Affibody ligands', developed by combinatorial engineering of a non-immunoglobulin-based scaffold protein. Homodimeric c-Jun protein was recombinantly produced in Escherichia coli and, prior to selection, the quality of the target protein was investigated by binding analyses, which indicated specific binding to a double-stranded DNA hairpin construct containing a c-Jun response element, but not to a control sequence. Isolated Affibody variants from the phage selection were expressed in E. coli, purified by affinity chromatography and their interaction with c-Jun was analysed. In biosensor analyses, one Affibody ligand, denoted Z(cJun518), was shown to interact with immobilized c-Jun protein with an apparent dissociation constant of 5 µM. By constructing a head-to-tail homodimeric version of Z(cJun518), its apparent affinity for c-Jun could be increased threefold, suggesting co-operativity effects in the binding to the immobilized c-Jun protein. Further characterization of the Z(cJun518) Affibody molecule demonstrated, in both affinity-capture and Western-blotting experiments, its ability to interact selectively with c-Jun, even when the c-Jun target was present in a complex protein background consisting of a bacterial cell lysate. Z(cJun518) could also be used to stain the c-Jun-overexpressing cell line C8161 visualized by confocal fluorescence microscopy. Results from competition experiments indicated that the binding epitope on c-Jun for the Z(cJun518) Affibody molecule was separate from the binding sites of both a polyclonal antibody raised against the unstructured N-terminal domain and a double-stranded DNA hairpin containing a c-Jun response element. The potential intracellular use of Affibody ligands directed against transcription factors and other oncogenic factors is discussed.
View details for DOI 10.1042/BA20070178
View details for PubMedID 18260830
Automated Analysis of Human Protein Atlas Immunofluorescence Images.
Proceedings. IEEE International Symposium on Biomedical Imaging
2009; 5193229: 1023-1026
The Human Protein Atlas is a rich source of location proteomics data. In this work, we present an automated approach for processing and classifying major subcellular patterns in the Atlas images. We demonstrate that two different classification frameworks (support vector machine and random forest) are effective at determining subcellular locations; we can analyze over 3500 Atlas images with a high degree of accuracy, up to 87.5% for all of the samples and 98.5% when only considering samples in whose classification assignments we are most confident. Moreover, the features obtained in both of these frameworks are observed to be highly consistent and generalizable. Additionally, we observe that the features relating the proteins to cell markers are especially important in automated learning approaches.
View details for DOI 10.1109/ISBI.2009.5193229
View details for PubMedID 20628548
View details for PubMedCentralID PMC2901900
The correlation between cellular size and protein expression levels--normalization for global protein profiling.
Journal of proteomics
2008; 71 (4): 448-60
An automated image analysis system was used for protein quantification of 1862 human proteins in 47 cancer cell lines and 12 clinical cell samples using cell microarrays and immunohistochemistry. The analysis suggests that most proteins are expressed in a cell size dependent manner, and that normalization is required for comparative protein quantification in order to correct for the inherent bias of cell size and systematic ambiguities associated with immunohistochemistry. Two reference standards were evaluated, and normalized protein expression values were found to allow for protein profiling across a panel of morphologically diverse cells, revealing putative patterns of over- and underexpression. Using this approach, proteins with stable expression as well as cell-line specific expression were identified. The results demonstrate the value of large-scale, automated proteome analysis using immunohistochemistry, in revealing functional correlations and establishing methods to interpret and mine proteomic data.
View details for DOI 10.1016/j.jprot.2008.06.014
View details for PubMedID 18656560
A genecentric Human Protein Atlas for expression profiles based on antibodies.
Molecular & cellular proteomics : MCP
2008; 7 (10): 2019-27
An attractive path forward in proteomics is to experimentally annotate the human protein complement of the genome in a genecentric manner. Using antibodies, it might be possible to design protein-specific probes for a representative protein from every protein-coding gene and to subsequently use the antibodies for systematic analysis of cellular distribution and subcellular localization of proteins in normal and disease tissues. A new version (4.0) of the Human Protein Atlas has been developed in a genecentric manner with the inclusion of all human genes and splice variants predicted from genome efforts together with a visualization of each protein with characteristics such as predicted membrane regions, signal peptide, and protein domains and new plots showing the uniqueness (sequence similarity) of every fraction of each protein toward all other human proteins. The new version is based on tissue profiles generated from 6120 antibodies with more than five million immunohistochemistry-based images covering 5067 human genes, corresponding to approximately 25% of the human genome. Version 4.0 includes a putative list of members in various protein classes, both functional classes, such as kinases, transcription factors, G-protein-coupled receptors, etc., and project-related classes, such as candidate genes for cancer or cardiovascular diseases. The exact antigen sequence for the internally generated antibodies has also been released together with a visualization of the application-specific validation performed for each antibody, including a protein array assay, Western blot analysis, immunohistochemistry, and, for a large fraction, immunofluorescence-based confocal microscopy. New search functionalities have been added to allow complex queries regarding protein expression profiles, protein classes, and chromosome location. The new version of the protein atlas thus is a resource for many areas of biomedical research, including protein science and biomarker discovery.
View details for DOI 10.1074/mcp.R800013-MCP200
View details for PubMedID 18669619
Affinity-based entrapment of the HER2 receptor in the endoplasmic reticulum using an affibody molecule.
Journal of immunological methods
2008; 338 (1-2): 1-6
Interference with the export of cell surface receptors can be performed through co-expression of specific affinity molecules designed for entrapment in the endoplasmic reticulum during the export process. We describe the investigation of a small (6 kDa) non-immunoglobulin-based HER2 receptor binding affibody molecule (Z(HER2:00477)), for use in affinity mediated entrapment of the HER2 receptor in the ER. Constructs encoding Z(HER2:00477) or a control affibody protein, with or without ER-retention peptide extensions (KDEL), were expressed in the HER2 over-expressing cell line SKOV-3. Intracellular expression of the full-length affibody constructs could be confirmed by probing cell extracts by Western blotting. Confocal immunofluorescence microscopy experiments showed extensive co-localization of the HER2 receptor and Z(HER2:00477)-KDEL in the ER, whereas the use of a KDEL-extended control affibody molecule resulted in distinct and separate signals from cell surface-localized HER2 receptor and ER-localized affibody protein. This indicated a capability of the Z(HER2:00477)-KDEL fusion protein to functionally interfere with the export process of HER2 receptor in a specific manner. Using flow cytometry and cell proliferation analyses, it could be shown that expression of the Z(HER2:00477)-KDEL fusion construct in the SKOV-3 cell line resulted both in a marked reduction in the cell surface level of HER2 receptors and in a significantly increased cell population doubling time. Expression of the Z(HER2:00477)-KDEL fusion protein in additional cell lines of different origin and with different expression levels of endogenous HER2 receptor compared to SKOV-3 also resulted in depletion of the cell surface levels of HER2 receptor. This indicated a general ability of the Z(HER2:00477)-KDEL fusion protein to functionally interfere with the export process of HER2.
View details for DOI 10.1016/j.jim.2008.06.005
View details for PubMedID 18671978
Toward a confocal subcellular atlas of the human proteome.
Molecular & cellular proteomics : MCP
2008; 7 (3): 499-508
Information on protein localization on the subcellular level is important to map and characterize the proteome and to better understand cellular functions of proteins. Here we report on a pilot study of 466 proteins in three human cell lines aimed to allow large scale confocal microscopy analysis using protein-specific antibodies. Approximately 3000 high resolution images were generated, and more than 80% of the analyzed proteins could be classified in one or multiple subcellular compartment(s). The localizations of the proteins showed, in many cases, good agreement with the Gene Ontology localization prediction model. This is the first large scale antibody-based study to localize proteins into subcellular compartments using antibodies and confocal microscopy. The results suggest that this approach might be a valuable tool in conjunction with predictive models for protein localization.
View details for DOI 10.1074/mcp.M700325-MCP200
View details for PubMedID 18029348
A novel method for reproducible fluorescent labeling of small amounts of antibodies on solid phase.
Journal of immunological methods
2007; 322 (1-2): 40-9
Fluorescently labeled antibodies are very important tools in cell biology, providing for specific and quantitative detection of antigens. To date, fluorophore labeling of antibodies has been performed in solution and has been limited by low-throughput methods requiring a substantial amount of pure antibody sample at a high concentration. We have developed a novel solid-phase labeling protocol for small amounts (i.e. micrograms) of antibodies with fluorescent dyes. Protein A affinity medium was used as solid support in a micropipette tip format. This solid-phase approach, including the advantage of the strong and specific interaction between Protein A and antibodies, allows for simultaneous purification, labeling and concentration of the antibody sample, making it possible to start with impure antibody samples at low concentrations. We have optimized the protocol with regard to reaction pH, time, temperature and amount of amine reactive dye. In addition, we have evaluated the stability and activity of the labeled antibodies. To evaluate the reproducibility and robustness of this method we labeled eight antibodies with amine reactive fluorescent dyes followed by evaluation of antibody specificity on protein arrays. Interestingly, this gave an extremely high conformity in the degree of labeling, showing the robustness of the method. The solid-phase method also gave predictable and reproducible results and by varying the amount of reactive dye, the desired degree of labeling can easily be achieved. Antibodies labeled using this solid-phase method were similar in stability and activity to antibodies labeled in solution. This novel solid-phase antibody labeling method may also be applicable for other conjugation chemistries and labels, and has potential for high-throughput applications.
View details for DOI 10.1016/j.jim.2007.01.023
View details for PubMedID 17383674
Site-specifically conjugated anti-HER2 Affibody molecules as one-step reagents for target expression analyses on cells and xenograft samples.
Journal of immunological methods
2007; 319 (1-2): 53-63
Affibody molecules are a class of small and robust affinity proteins that can be generated to interact with a variety of antigens, thus having the potential to provide useful tools for biotechnological research and diagnostic applications. In this study, we have investigated Affibody-based reagents interacting specifically with the tyrosine kinase receptor HER2. A head-to-tail dimeric construct was site-specifically conjugated with different fluorescent and enzymatic groups resulting in reagents that were used for detection and quantification. The amount of cell surface expressed HER2 on eleven (11) well characterized cell lines was quantified relative to each other by flow cytometry and shown to correlate well with results from parallel analyses of HER2 mRNA levels measured by real-time PCR. Further, immunofluorescence microscopy studies of the cell lines and immunohistochemical analyses of cryosections of HER2 expressing SKOV-3 xenografts showed strong staining of the plasma membrane of tumor cells with little background staining. Full-length HER2 protein could also be efficiently recovered from a cell extract by an immunoprecipitation procedure, using an Affibody ligand-based resin. These novel non-IgG derived reagents could be used to detect and quantify HER2 expression. By adapting the methods for use with Affibody molecules binding to other cell surface receptors, it is anticipated that these receptors can also be detected and quantified in a similar manner.
View details for DOI 10.1016/j.jim.2006.10.013
View details for PubMedID 17196217