Abstract
Proteomic analysis of cells, tissues and body fluids has generated valuable insights into the complex processes influencing human biology. Proteins represent intermediate phenotypes for disease and provide insight into how genetic and non-genetic risk factors are mechanistically linked to clinical outcomes. Associations between protein levels and DNA sequence variants that colocalize with risk alleles for common diseases can expose disease-associated pathways, revealing novel drug targets and translational biomarkers. However, genome-wide, population-scale analyses of proteomic data are only now emerging. Here, we review current findings from studies of the plasma proteome and discuss their potential for advancing biomedical translation through the interpretation of genome-wide association analyses. We highlight the challenges faced by currently available technologies and provide perspectives relevant to their future application in large-scale biobank studies.
Acknowledgements
K.S. is supported by the Biomedical Research Program at Weill Cornell Medicine in Qatar, a programme funded by the Qatar Foundation. J.M.S. is supported by the KTH Center for Applied Precision Medicine funded by the Erling Persson Family Foundation and acknowledges the Knut and Alice Wallenberg Foundation for funding the Human Protein Atlas. J.M.S. and M.I.M. acknowledge the Innovative Medicines Initiative Joint Undertaking under grant agreement no. 115317 (DIRECT), the resources of which are composed of a financial contribution from the European Union’s Seventh Framework Programme and an EFPIA companies’ in kind contribution. The views expressed in this article are those of the authors and not necessarily those of the UK NHS, the UK NIHR, the UK Department of Health or the Qatar Foundation.
Author information
Contributions
K.S. and J.M.S. researched data for the article. All authors contributed to the discussion of content, to writing the article and to reviewing and editing the manuscript before submission.
Ethics declarations
Competing interests
M.I.M. has served on advisory panels for Pfizer, NovoNordisk and Zoe Global, has received honoraria from Merck, Pfizer, NovoNordisk and Eli Lilly, has stock options in Zoe Global and has received research funding from AbbVie, AstraZeneca, Boehringer Ingelheim, Eli Lilly, Janssen, Merck, NovoNordisk, Pfizer, Roche, Sanofi Aventis, Servier and Takeda. As of June 2019, M.I.M. is an employee of Genentech and holds stock in Roche. K.S. and J.M.S. declare no competing interests.
Additional information
Peer review information
Nature Reviews Genetics thanks M. Altelaar and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
A table of all published GWAS with proteomics: http://www.metabolomix.com/a-table-of-all-published-gwas-with-proteomics/
Human Plasma Proteome Project: https://www.hupo.org/plasma-proteome-project
Human Proteome Map: http://www.humanproteomemap.org
In vitro test systems that have been categorized by the FDA: https://www.fda.gov/medical-devices/medical-device-databases/clinical-laboratory-improvement-amendments-download-data
MRbase: a database and analytical platform for Mendelian randomization: http://www.mrbase.org
neXtProt, a knowledgebase on human proteins: https://www.nextprot.org
PeptideAtlas: http://www.peptideatlas.org
pGWAS server: connecting genetic risk to disease end points through the human blood plasma proteome: http://proteomics.gwas.eu
PhenoScanner: a database of human genotype-phenotype associations: http://www.phenoscanner.medschl.cam.ac.uk
ProteomeXchange, a consortium to provide coordinated data submission and dissemination of proteomics repositories: http://www.proteomexchange.org
ProteomicsDB: https://www.proteomicsdb.org
SNiPA: a tool for annotating and browsing genetic variants: http://snipa.org
SomaLogic white paper on SOMAmer specificity: https://somalogic.com/technology/our-platform/somamer-specificity/
The Genome Aggregation Database (gnomAD) browser: https://gnomad.broadinstitute.org/
The Genotype-Tissue Expression (GTEx) project portal: https://gtexportal.org
The Human Protein Atlas: https://www.proteinatlas.org
The Human Protein Atlas: the proteins actively secreted to human blood: https://www.proteinatlas.org/humanproteome/blood/secreted+to+blood
The NHGRI-EBI catalogue of published genome-wide association studies: https://www.ebi.ac.uk/gwas
Uniprot Proteomes — Homo sapiens: https://www.uniprot.org/proteomes/UP000005640
Glossary
- Colocalization
Two genetic associations are said to be colocalized if the strengths of their statistical associations covary at a genetic locus, suggesting a shared genetic causal variant for the observed associations.
- Protein QTLs (pQTLs)
A protein quantitative trait locus (pQTL) is a genetic locus at which variation is associated with protein levels; it is often represented by the single-nucleotide polymorphism with the strongest association.
- pQTL studies
Genome-wide association studies in which the dependent variables are the levels of proteins measured using a proteomics approach. The identified loci that associate with protein levels are termed ‘protein quantitative trait loci’ (pQTLs).
- Open reading frames
Portions of DNA that can be translated into protein and that are terminated by a stop codon.
- Post-translational modifications
Biochemical modifications of the primary peptide sequence, typically by covalent addition of a chemical group, as in phosphorylation and glycosylation. Post-translational modifications can change the accessibility of a protein epitope and potentially influence the binding of affinity reagents.
- Data-dependent acquisition (DDA)
A data acquisition mode used in mass spectrometry analysis in which only a selected set of peptides, those with the most intense peptide ions, is fragmented and analysed.
- Data-independent acquisition (DIA)
A data acquisition mode used in mass spectrometry analysis in which all peptides detected within a particular window of the mass-to-charge ratio are fragmented and analysed.
- Aptamers
Short single-stranded (and possibly modified) oligonucleotides that are selected from a synthetic library of sequences to recognize a specific target protein (for example, via structural elements) with high affinity.
- cis-pQTLs
Protein quantitative trait loci (pQTLs) that lie at or near the genetic locus that encodes the associated protein; often an ad hoc distance cut-off is used to differentiate cis-pQTLs from trans-pQTLs. A cis-pQTL suggests a direct influence of a genetic variant at that locus on protein expression or turnover.
- trans-pQTLs
Protein quantitative trait loci (pQTLs) that are distant from the gene encoding the associated protein or lie on another chromosome. A trans-pQTL indicates an indirect link between the genetic locus and protein expression or turnover.
- Linkage disequilibrium
Two genetic loci are in linkage disequilibrium if their genotypes correlate within a population. Lack of recombination between loci results in them commonly being co-inherited as a haplotype.
- Mendelian randomization
A method to estimate the unconfounded effect of an exposure (for example, protein level) on an outcome (for example, disease risk) using genetic variation.
- Gaussian graphical models (GGMs)
Network representations of the partial correlations between a set of quantitative variables, here the protein levels. Partial correlations used in a protein GGM can be viewed as the amount of pairwise correlation between the levels of two proteins that remains when the contributions of all other proteins are accounted for.
- Pleiotropic
A genetic locus is pleiotropic when one or more of its variants is associated with two or more seemingly unrelated phenotypic traits.
- Epitope effect
An effect of an epitope-changing variant on the binding properties of affinity reagents with regard to their antigens. A difference in reported antigen recognition may be mistaken for a difference in protein abundance.
- Polygenic risk scores
Combined risk scores derived from a weighted combination of genetic associations, possibly including millions of associations.
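The partial correlation described in the glossary entry on Gaussian graphical models can be illustrated with a minimal Python sketch. It covers only the three-protein case, in which "all other proteins" reduces to a single protein, and uses the standard first-order partial correlation formula; a full GGM would instead derive all partial correlations from the inverse covariance matrix. The protein names, the sample values and the helper functions `pearson` and `partial_corr` are hypothetical and chosen for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y that remains after accounting for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical levels of three proteins in eight samples (illustrative only).
# Proteins A and B each follow protein C plus their own independent noise,
# so they are linked to each other only through C.
protein_c = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise_a = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
noise_b = [0.5, 0.5, -0.5, -0.5, -0.5, -0.5, 0.5, 0.5]
protein_a = [c + e for c, e in zip(protein_c, noise_a)]
protein_b = [c + e for c, e in zip(protein_c, noise_b)]

marginal = pearson(protein_a, protein_b)                  # about 0.95
partial = partial_corr(protein_a, protein_b, protein_c)   # essentially zero

print(f"pairwise correlation A-B:        {marginal:.3f}")
print(f"partial correlation A-B given C: {partial:.3f}")
```

In this constructed example, A and B are strongly correlated pairwise, but the correlation disappears once C is accounted for. A GGM built from these data would therefore connect A to C and B to C, but would place no edge between A and B.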
About this article
Cite this article
Suhre, K., McCarthy, M.I. & Schwenk, J.M. Genetics meets proteomics: perspectives for large population-based studies. Nat Rev Genet 22, 19–37 (2021). https://doi.org/10.1038/s41576-020-0268-2
DOI: https://doi.org/10.1038/s41576-020-0268-2