Resource
Published: 12 February 2024

Integrated multiomic profiling of breast cancer in the Chinese population reveals patient stratification and therapeutic vulnerabilities

Nature Cancer volume 5, pages 673–690 (2024)

Abstract

Molecular profiling guides precision treatment of breast cancer; however, Asian patients are underrepresented in publicly available large-scale studies. We established a comprehensive multiomics cohort of 773 Chinese patients with breast cancer and systematically analyzed their genomic, transcriptomic, proteomic, metabolomic, radiomic and digital pathology characteristics. Here we show that compared to breast cancers in white individuals, Asian individuals had more targetable AKT1 mutations. Integrated analysis revealed a higher proportion of HER2-enriched subtype and correspondingly more frequent ERBB2 amplification and higher HER2 protein abundance in the Chinese HR⁺HER2⁺ cohort, stressing anti-HER2 therapy for these individuals. Furthermore, comprehensive metabolomic and proteomic analyses revealed ferroptosis as a potential therapeutic target for basal-like tumors. The integration of clinical, transcriptomic, metabolomic, radiomic and pathological features allowed for efficient stratification of patients into groups with varying recurrence risks. Our study provides a public resource and new insights into the biology and ancestry specificity of breast cancer in the Asian population, offering potential for further precision treatment approaches.

**Fig. 1: Multiomics landscape of the CBCGA cohort.**

**Fig. 2: Ancestry-specific molecular features of breast cancers in Chinese patients.**

**Fig. 3: Proteogenomic profiling yields new insights into breast cancer subtypes.**

**Fig. 4: Systematic evaluation of metabolic dysregulation with polar metabolomics and lipidomics.**

**Fig. 5: Immunogenomic analysis deciphered the heterogeneity of the TME in breast cancer.**

**Fig. 6: Multimodal data integration using machine learning for risk stratification of breast cancer.**

Data availability

WES, CNA, RNA-seq and metabolome data that support the findings of this study have been deposited in the Genome Sequence Archive database under accession code PRJCA017539. MS data have been deposited in iProX under accession code IPX0006535000. Human breast cancer genomic, transcriptomic and proteomic data were derived from the FUSCC targeted sequencing cohort, TCGA Research Network, Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Clinical Proteomic Tumor Analysis Consortium (CPTAC). The datasets derived from TCGA, METABRIC and CPTAC are available at the cBioPortal website (www.cbioportal.org/). FUSCC targeted sequencing data are available in the Fudan Data Portal (https://data.3steps.cn/cdataportal/). All other data supporting the findings of this study are available from the corresponding author on reasonable request. Source data are provided with this paper.

Code availability

All data analysis and processing were conducted using published software packages whose details have been previously described and referenced within the Methods. No new code or mathematical algorithms were generated from this manuscript.

References

Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249 (2021).
Waks, A. G. & Winer, E. P. Breast cancer treatment: a review. JAMA 321, 288–300 (2019).
Gennari, A. et al. ESMO clinical practice guideline for the diagnosis, staging and treatment of patients with metastatic breast cancer. Ann. Oncol. https://doi.org/10.1016/j.annonc.2021.09.019 (2021).
Razavi, P. et al. The genomic landscape of endocrine-resistant advanced breast cancers. Cancer Cell 34, 427–438 (2018).
Jiang, Y. Z. et al. Genomic and transcriptomic landscape of triple-negative breast cancers: subtypes and treatment strategies. Cancer Cell 35, 428–440 (2019).
Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).
Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486, 346–352 (2012).
Ciriello, G. et al. Comprehensive molecular portraits of invasive lobular breast cancer. Cell 163, 506–519 (2015).
Sammut, S. J. et al. Multi-omic machine learning predictor of breast cancer therapy response. Nature 601, 623–629 (2022).
Boehm, K. M. et al. Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer. Nat. Cancer 3, 723–733 (2022).
Mertins, P. et al. Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 534, 55–62 (2016).
Krug, K. et al. Proteogenomic landscape of breast cancer tumourigenesis and targeted therapy. Cell 183, 1436–1456 (2020).
Pan, J. W. et al. The molecular landscape of Asian breast cancers reveals clinically relevant population-specific differences. Nat. Commun. 11, 6433 (2020).
Kan, Z. et al. Multi-omics profiling of younger Asian breast cancers reveals distinctive molecular signatures. Nat. Commun. 9, 1725 (2018).
Shimoi, T. et al. Hotspot mutation profiles of AKT1 in Asian women with breast and endometrial cancers. BMC Cancer 21, 1131 (2021).
Lang, G. T. et al. Characterization of the genomic landscape and actionable mutations in Chinese breast cancers by clinical sequencing. Nat. Commun. 11, 5679 (2020).
Lee, Y. R. et al. WWP1 gain-of-function inactivation of PTEN in cancer predisposition. N. Engl. J. Med. 382, 2103–2116 (2020).
Lee, Y. R. et al. Reactivation of PTEN tumour suppressor for cancer treatment through inhibition of a MYC-WWP1 inhibitory pathway. Science https://doi.org/10.1126/science.aau0159 (2019).
Wang, B. et al. Similarity network fusion for aggregating data types on a genomic scale. Nat. Methods 11, 333–337 (2014).
Wolf, D. M. et al. Redefining breast cancer subtypes to guide treatment prioritization and maximize response: predictive biomarkers across 10 cancer therapies. Cancer Cell https://doi.org/10.1016/j.ccell.2022.05.005 (2022).
Hakimi, A. A. et al. An integrated metabolic atlas of clear cell renal cell carcinoma. Cancer Cell 29, 104–116 (2016).
Xiao, Y. et al. Comprehensive metabolomics expands precision medicine for triple-negative breast cancer. Cell Res 32, 477–490 (2022).
Xiao, Y. et al. Multi-omics profiling reveals distinct microenvironment characterization and suggests immune escape mechanisms of triple-negative breast cancer. Clin. Cancer Res. 25, 5002–5014 (2019).
Pusztai, L. et al. Durvalumab with olaparib and paclitaxel for high-risk HER2-negative stage II/III breast cancer: results from the adaptively randomized I-SPY2 trial. Cancer Cell https://doi.org/10.1016/j.ccell.2021.05.009 (2021).
Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457 (2015).
Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 4, 2612 (2013).
Wu, S. Z. et al. A single-cell and spatially resolved atlas of human breast cancers. Nat. Genet. 53, 1334–1347 (2021).
Haas, B. J. et al. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol. 20, 213 (2019).
Uhrig, S. et al. Accurate and efficient detection of gene fusions from RNA sequencing data. Genome Res. 31, 448–460 (2021).
Ding, R. et al. Breast cancer screening and early diagnosis in Chinese women. Cancer Biol. Med. https://doi.org/10.20892/j.issn.2095-3941.2021.0676 (2022).
Lee, S. K. et al. Is the high proportion of young age at breast cancer onset a unique feature of Asian breast cancer? Breast Cancer Res. Treat. 173, 189–199 (2019).
Zhu, B. et al. Comparison of somatic mutation landscapes in Chinese versus European breast cancer patients. HGG Adv. 3, 100076 (2022).
Wander, S. A. et al. The genomic landscape of intrinsic and acquired resistance to cyclin-dependent kinase 4/6 inhibitors in patients with hormone receptor-positive metastatic breast cancer. Cancer Discov. 10, 1174–1193 (2020).
Kalinsky, K. et al. Effect of capivasertib in patients with an AKT1 E17K-mutated tumour: NCI-MATCH subprotocol EAY131-Y nonrandomized trial. JAMA Oncol. 7, 271–278 (2021).
Smyth, L. M. et al. Capivasertib, an AKT kinase inhibitor, as monotherapy or in combination with fulvestrant in patients with AKT1 (E17K)-mutant, ER-positive metastatic breast cancer. Clin. Cancer Res. 26, 3947–3957 (2020).
Jones, R. H. et al. Fulvestrant plus capivasertib versus placebo after relapse or progression on an aromatase inhibitor in metastatic, oestrogen receptor-positive breast cancer (FAKTION): a multicentre, randomised, controlled, phase 2 trial. Lancet Oncol. 21, 345–357 (2020).
Gianni, L. et al. Efficacy and safety of neoadjuvant pertuzumab and trastuzumab in women with locally advanced, inflammatory, or early HER2-positive breast cancer (NeoSphere): a randomised multicentre, open-label, phase 2 trial. Lancet Oncol. 13, 25–32 (2012).
Robidoux, A. et al. Lapatinib as a component of neoadjuvant therapy for HER2-positive operable breast cancer (NSABP protocol B-41): an open-label, randomised phase 3 trial. Lancet Oncol. 14, 1183–1192 (2013).
de Azambuja, E. et al. Lapatinib with trastuzumab for HER2-positive early breast cancer (NeoALTTO): survival outcomes of a randomised, open-label, multicentre, phase 3 trial and their association with pathological complete response. Lancet Oncol. 15, 1137–1146 (2014).
Gianni, L. et al. Neoadjuvant chemotherapy with trastuzumab followed by adjuvant trastuzumab versus neoadjuvant chemotherapy alone, in patients with HER2-positive locally advanced breast cancer (the NOAH trial): a randomised controlled superiority trial with a parallel HER2-negative cohort. Lancet 375, 377–384 (2010).
Shao, Z. et al. Efficacy, safety, and tolerability of pertuzumab, trastuzumab, and docetaxel for patients with early or locally advanced ERBB2-positive breast cancer in Asia: the PEONY Phase 3 randomized clinical trial. JAMA Oncol. 6, e193692 (2020).
Llombart-Cussac, A. et al. HER2-enriched subtype as a predictor of pathological complete response following trastuzumab and lapatinib without chemotherapy in early-stage HER2-positive breast cancer (PAMELA): an open-label, single-group, multicentre, phase 2 trial. Lancet Oncol. 18, 545–554 (2017).
Prat, A. et al. HER2-enriched subtype and ERBB2 expression in HER2-positive breast cancer treated with dual HER2 blockade. J. Natl Cancer Inst. 112, 46–54 (2020).
Ciriello, G. et al. Emerging landscape of oncogenic signatures across human cancers. Nat. Genet. 45, 1127–1133 (2013).
Denkert, C. et al. Tumour-infiltrating lymphocytes and prognosis in different subtypes of breast cancer: a pooled analysis of 3771 patients treated with neoadjuvant therapy. Lancet Oncol. 19, 40–50 (2018).
Tang, X. et al. A joint analysis of metabolomics and genetics of breast cancer. Breast Cancer Res. 16, 415 (2014).
Terunuma, A. et al. MYC-driven accumulation of 2-hydroxyglutarate is associated with breast cancer prognosis. J. Clin. Invest. 124, 398–412 (2014).
Nguyen, T. et al. Uncovering the role of N-acetyl-aspartyl-glutamate as a glutamate reservoir in cancer. Cell Rep. 27, 491–501 (2019).
Muthusamy, T. et al. Serine restriction alters sphingolipid diversity to constrain tumour growth. Nature 586, 790–795 (2020).
Ogretmen, B. Sphingolipid metabolism in cancer signalling and therapy. Nat. Rev. Cancer 18, 33–50 (2018).
Zheng, J. & Conrad, M. The metabolic underpinnings of ferroptosis. Cell Metab. 32, 920–937 (2020).
Chen, X., Kang, R., Kroemer, G. & Tang, D. Broadening horizons: the role of ferroptosis in cancer. Nat. Rev. Clin. Oncol. 18, 280–296 (2021).
Jiang, L. et al. Radiogenomic analysis reveals tumour heterogeneity of triple-negative breast cancer. Cell Rep. Med. 3, 100694 (2022).
Zhao, S. et al. Deep learning framework for comprehensive molecular and prognostic stratifications of triple-negative breast cancer. Fundam. Res. https://doi.org/10.1016/j.fmre.2022.06.008 (2022).
Jiang, Y.-Z. et al. Integrated molecular portraits of breast cancer. Nat. Protoc. https://doi.org/10.21203/rs.3.pex-2435/v1 (2023).
Parker, J. S. et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27, 1160–1167 (2009).
Paquet, E. R. & Hallett, M. T. Absolute assignment of breast cancer intrinsic molecular subtype. J. Natl Cancer Inst. https://doi.org/10.1093/jnci/dju357 (2015).
Chen, B., Khodadoust, M. S., Liu, C. L., Newman, A. M. & Alizadeh, A. A. Profiling tumour infiltrating immune cells with CIBERSORT. Methods Mol. Biol. 1711, 243–259 (2018).
Becht, E. et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 17, 218 (2016).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. PNAS 102, 15545–15550 (2005).
Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Hanzelmann, S., Castelo, R. & Guinney, J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 14, 7 (2013).
Telli, M. L. et al. Homologous recombination deficiency (HRD) score predicts response to platinum-containing neoadjuvant chemotherapy in patients with triple-negative breast cancer. Clin. Cancer Res. 22, 3764–3773 (2016).
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
Benard, B. A. et al. Clonal architecture predicts clinical outcomes and drug sensitivity in acute myeloid leukemia. Nat. Commun. 12, 7244 (2021).
Amin, S. B. et al. Comparative molecular life history of spontaneous canine and human gliomas. Cancer Cell 37, 243–257 (2020).
Roth, A. et al. PyClone: statistical inference of clonal population structure in cancer. Nat. Methods 11, 396–398 (2014).
McGranahan, N. et al. Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade. Science 351, 1463–1469 (2016).
Chen, D. et al. Identification and characterization of robust hepatocellular carcinoma prognostic subtypes based on an integrative metabolite-protein interaction network. Adv. Sci. 8, e2100311 (2021).
Johansson, H. J. et al. Breast cancer quantitative proteome and proteogenomic landscape. Nat. Commun. 10, 1600 (2019).
Chen, Y. J. et al. Proteogenomics of non-smoking lung cancer in East Asia delineates molecular signatures of pathogenesis and progression. Cell 182, 226–244 (2020).
Avants, B. B., Epstein, C. L., Grossman, M. & Gee, J. C. Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Med. Image Anal. 12, 26–41 (2008).
Neal, J. T. et al. Organoid modeling of the tumour immune microenvironment. Cell 175, 1972–1988 (2018).
Sachs, N. et al. A living biobank of breast cancer organoids captures disease heterogeneity. Cell 172, 373–386 (2018).


Acknowledgements

This work was supported by grants from the National Key Research and Development Project of China (grant nos. 2020YFA0112304 to Z.-M.S. and Y.-Z.J., and 2021YFF1201300 to Y.-Z.J., W.Huang and J.S.), the National Natural Science Foundation of China (grant nos. 92159301, 82341003 and 91959207 to Z.-M.S., 82272822 to Y.-Z.J., 82272704 to D.M. and 32370701 to L.S.), the Shanghai Key Laboratory of Breast Cancer (grant no. 12DZ2260100 to Z.-M.S.), the Shanghai Hospital Development Center Municipal Project for Developing Emerging and Frontier Technology in Shanghai Hospitals (grant no. SHDC12021103 to Z.-M.S.), the Program of Shanghai Academic/Technology Research Leader (grant no. 20XD1421100 to Y.-Z.J.), the Natural Science Foundation of Shanghai (grant nos. 22ZR1479200 to Y.-Z.J. and 23ZR1411800 to X.J.), the Shanghai Rising-Star Program (grant no. 23QA1401400 to D.M.), the Youth Talent Program of Shanghai Health Commission (grant no. 2022YQ012 to X.J.) and the Shanghai Municipal Science and Technology Major Project (grant no. 2023SHZDZX02 to L.S.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We are grateful to Computing for the Future at Fudan and the Human Phenome Data Center of Fudan University for computing support. We also thank J. Xu from Nanjing University of Information Science and Technology for editing the manuscript.

Author information

These authors contributed equally: Yi-Zhou Jiang, Ding Ma, Xi Jin, Yi Xiao, Ying Yu, Jinxiu Shi.

Authors and Affiliations

Key Laboratory of Breast Cancer, Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
Yi-Zhou Jiang, Ding Ma, Xi Jin, Yi Xiao, Yi-Fan Zhou, Tong Fu, Cai-Jin Lin, Lei-Jie Dai, Cheng-Lin Liu, Shen Zhao, Guan-Hua Su, Wen-Juan Zhang & Zhi-Ming Shao
State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
Ying Yu, Wanwan Hou, Yaqing Liu, Qingwang Chen, Jingcheng Yang, Naixin Zhang, Leming Shi & Yuanting Zheng
Shanghai-MOST Key Laboratory of Health and Disease Genomics, Shanghai Institute for Biomedical and Pharmaceutical Technologies (SIBPT), Shanghai, China
Jinxiu Shi & Wei Huang
Greater Bay Area Institute of Precision Medicine, Guangzhou, China
Jingcheng Yang
Westlake Omics (Hangzhou) Biotechnology, Hangzhou, China
Wei Liu & Weigang Ge
Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai, China
Wen-Tao Yang
Department of Radiology, Fudan University Shanghai Cancer Center, Shanghai, China
Chao You & Yajia Gu
Division Haematology/Oncology, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA
Virginia Kaklamani
Predictive Oncology Laboratory and Department of Medical Oncology, CRCM, Institut Paoli-Calmettes, Inserm UMR1068, CNRS UMR7258, Aix-Marseille Université, Marseille, France
François Bertucci
The Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
Claire Verschraegen
Department of Bioinformatics and Computational Biology, Genentech, South San Francisco, CA, USA
Anneleen Daemen
Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
Nakul M. Shah & Ting Wang
Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
Nakul M. Shah & Ting Wang
McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
Ting Wang
Westlake Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China
Tiannan Guo
School of Medicine, School of Life Sciences, Westlake University, Hangzhou, China
Tiannan Guo
Research Center for Industries of the Future, Westlake University, Hangzhou, China
Tiannan Guo
International Human Phenome Institutes (Shanghai), Shanghai, China
Leming Shi
Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Charles M. Perou

Authors

Yi-Zhou Jiang, Ding Ma, Xi Jin, Yi Xiao, Ying Yu, Jinxiu Shi, Yi-Fan Zhou, Tong Fu, Cai-Jin Lin, Lei-Jie Dai, Cheng-Lin Liu, Shen Zhao, Guan-Hua Su, Wanwan Hou, Yaqing Liu, Qingwang Chen, Jingcheng Yang, Naixin Zhang, Wen-Juan Zhang, Wei Liu, Weigang Ge, Wen-Tao Yang, Chao You, Yajia Gu, Virginia Kaklamani, François Bertucci, Claire Verschraegen, Anneleen Daemen, Nakul M. Shah, Ting Wang, Tiannan Guo, Leming Shi, Charles M. Perou, Yuanting Zheng, Wei Huang & Zhi-Ming Shao

Contributions

Z.-M.S., W.Huang, Y.Z. and Y.-Z.J. outlined the manuscript content. J.S. and W.Huang performed the genomic sequencing. Y.Y., W.Hou, Y.L., Q.C., J.Y., N.Z., L.S. and Y.Z. performed RNA sequencing and contributed to data processing and analysis. W.L., W.G. and T.G. performed proteomics. S.Z., G.-H.S., W.-T.Y., C.Y. and Y.G. contributed to multimodal data integration. Y.-Z.J., D.M., X.J., Y.-F.Z., T.F., C.-J.L., L.-J.D., C.-L.L. and W.-J.Z. contributed to literature survey, data collection and data analysis. Y.-Z.J., D.M., X.J. and Y.X. prepared the figures and drafted the manuscript, with contributions from all authors. V.K., F.B., C.V., A.D., N.M.S., T.W. and C.M.P. helped with data interpretation and manuscript editing. All authors approved the final manuscript.

Corresponding authors

Correspondence to Yi-Zhou Jiang, Yuanting Zheng, Wei Huang or Zhi-Ming Shao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Cancer thanks Xiaohong Yang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Clinical and molecular characteristics of the Chinese Breast cancer Genome Atlas (CBCGA) cohort.

a, Cohort and omics information. b, The matching information between immunohistochemistry (IHC) subtypes and PAM50 subtypes is displayed using a confusion matrix in which numbers on the diagonal represent subtype agreement between the two subtyping methods (in n = 752 tumors). Abbreviations for PAM50 subtypes: LumA, luminal A; LumB, luminal B; HER2, HER2-enriched; Basal, basal-like; Normal, normal-like. c, The matching information between AIMS subtypes and PAM50 subtypes is displayed using a confusion matrix (in n = 752 tumors). d, Differentially expressed proteins across PAM50 subtypes. From left to right, differential expression analyses were conducted between Luminal A (n = 56 tumors), Luminal B (n = 77 tumors), HER2-enriched (n = 59 tumors), Basal-like (n = 59 tumors) and the other subtypes (n = 215, 194, 212 and 212 tumors, respectively). e, f, Differentially expressed polar metabolites (e) and lipids (f) across PAM50 subtypes. From left to right, differential expression analyses were conducted between Luminal A (n = 119 tumors), Luminal B (n = 144 tumors), HER2-enriched (n = 98 tumors), Basal-like (n = 52 tumors) and the other subtypes (n = 324, 299, 345 and 391 tumors, respectively). For d–f, two-sided P values were determined by Mann–Whitney U-test and adjusted by the Benjamini–Hochberg procedure. Proteins, polar metabolites and lipids were colored gray if they did not meet the criteria that the absolute value of the log₂ fold change (log2FC) is greater than 1 or FDR < 0.05.
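The Benjamini–Hochberg adjustment named in this legend can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `benjamini_hochberg` and `is_highlighted` are hypothetical, and the highlighting rule assumes that a feature is colored only when it passes both thresholds (|log2FC| > 1 and FDR < 0.05), which is one reading of the legend.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (FDR) in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_highlighted(log2fc, fdr, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Assumption: a feature is colored only if it passes BOTH thresholds."""
    return abs(log2fc) > fc_cutoff and fdr < fdr_cutoff
```

In practice the raw P values would come from a two-sided Mann–Whitney U-test on each feature; features for which `is_highlighted` returns `False` would be drawn in gray.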


Extended Data Fig. 2 Comparisons between breast cancers arising in CBCGA Chinese and The Cancer Genome Atlas (TCGA) white individuals.

a–e, Gene-level somatic mutation frequencies of the IDC cases in the Luminal A (CBCGA: n = 182 tumors; TCGA: n = 229 tumors) (a), Luminal B (CBCGA: n = 180 tumors; TCGA: n = 183 tumors) (b), HER2-enriched (CBCGA: n = 121 tumors; TCGA: n = 35 tumors) (c), Basal-like (CBCGA: n = 83 tumors; TCGA: n = 86 tumors) (d) and Normal-like (CBCGA: n = 41 tumors; TCGA: n = 9 tumors) (e) cohorts. f, AKT1 mutation frequency found in IDC cases in East Asian (CBCGA: n = 624 tumors; Targeted sequencing cohort: n = 3,208 tumors; NCCH: n = 311 tumors) and white individuals (TCGA: n = 474 tumors; METABRIC: n = 1,866 tumors) breast cancer cohorts. ‘*’ denotes cohorts for which PAM50 subtypes are not available; for these, the AKT1 mutation frequency in all cases is shown. g, AKT1 mutation sites found in luminal A IDC patients in the CBCGA (upper) and TCGA white individuals (lower) cohorts.


Extended Data Fig. 3 Comparisons in molecular subtype and ERBB2 amplification between breast cancers arising in CBCGA Chinese and TCGA white individuals.

a, b, Proportion of Luminal A (a) and HER2-enriched (b) breast cancer in the IDC cases of CBCGA Chinese (n = 716 tumors) and TCGA Asian (n = 47 tumors) compared with TCGA white individuals (n = 490 tumors) and METABRIC (n = 1,974 tumors) cohorts. c–e, Gene-level somatic copy number alterations of the IDC cases in the CBCGA and TCGA white individuals cohorts grouped by IHC-based subtypes: amplifications (upper) and deletions (lower) in HR⁺HER2⁻ (c), HR⁻HER2⁺ (d) and triple-negative breast cancer (e). For a, b, P values were obtained from two-sided Fisher’s exact test and adjusted by the Benjamini–Hochberg procedure.


Extended Data Fig. 4 Quality control of proteomics and impact of copy number alteration on mRNA and protein expression.

a, Bar plot showing the detected genes in each batch. The totality of detected genes was 10864. b, Principal component analysis (PCA) evaluating the batch effect with all genes that were detected in over 70% of included samples after normalization and batch effect removement. c, Dot plots showing the Pearson’s correlation between technical replicates (samples within batch 33 and 34) with all genes that were detected in over 70% of included samples after normalization and batch effect removement. d, Venn diagrams depicting the cis-effect of CNA (FDR < 0.05) along the central dogma in this study and the studies published by Mertins and colleagues¹⁰ (n = 74 tumors) and by Krug and colleagues¹¹ (n = 122 tumors). e, f, Boxplot showing the mRNA level and protein level of WWP1 (e) and CCND1 (f) across different GISTIC scores in each PAM50 subtype. For WWP1 analysis, the number of samples were as follows: LumA: n = 188 tumors in the RNA analysis and n = 52 tumors in the protein analysis; LumB: n = 198 tumors in the RNA analysis and n = 73 tumors in the protein analysis; HER2: n = 121 tumors in the RNA analysis and n = 49 tumors in the protein analysis; Basal: n = 88 tumors in the RNA analysis and n = 44 tumors in the protein analysis; Normal: n = 47 tumors in the RNA analysis and n = 20 tumors in the protein analysis. For CCND1 analysis, the number of samples were as follows: LumA: n = 147 tumors in the RNA analysis and n = 41 tumors in the protein analysis; LumB: n = 163 tumors in the RNA analysis and n = 58 tumors in the protein analysis; HER2: n = 105 tumors in the RNA analysis and n = 41 tumors in the protein analysis; Basal: n = 76 tumors in the RNA analysis and n = 31 tumors in the protein analysis; Normal: n = 37 tumors in the RNA analysis and n = 12 tumors in the protein analysis. 
In boxplots, the centreline represents the median, the box limits represent the upper and lower quartiles, the whiskers represent the 1.5× interquartile range, and the points represent individual samples. g, h, Forest plot of multivariate Cox regression analysis for relapse-free survival adjusting for PAM50 clusters, tumor size and lymph node status in the overall population (n = 271 tumors) (g) and the HR⁺HER2⁻ subgroup (n = 148 tumors) (h). Error bars represent the 95% confidence intervals (CI) of the hazard ratio (HR) and the center of each error bar indicates the HR. i, Gene set enrichment analysis (GSEA) comparing the molecular characteristics of each integrated cluster with the others. Pathways that were significantly enriched in a given cluster (FDR < 0.25) are shown. j, Heat map showing the abundance of immune cells in Cluster 3 (n = 75 tumors) and non-Cluster 3 (n = 196 tumors) breast cancers. Cell types that were significantly elevated in the Cluster 3 subgroup are marked with asterisks. k, Enrichment of immunotherapy predictive signatures in integrated clusters and PAM50 subtypes indicated by logistic model in the overall population (n = 271 tumors) and the HR⁺HER2⁻ subgroup (n = 148 tumors). For d, P values were obtained from Spearman’s rank test with false discovery rate correction. For e, f, two-sided Wilcoxon rank tests were conducted to compare the mRNA level or protein level between samples with GISTIC scores of ‘0’ and ‘2’ in different PAM50 subtypes. *: P value < 0.05; N.S.: not significant, P value > 0.05. For g, h, P values were obtained from two-sided multivariate Cox regression analysis. The bold font indicates a P value less than 0.05. For j, P values were obtained from unpaired two-sided t-test.

Source data

Extended Data Fig. 5 Quality control and overview of polar metabolomic and lipidomic data in CBCGA.

a, The distribution of quality control (QC) samples in principal component analysis (PCA) of polar metabolomic data in positive- (left panel) and negative- (right panel) ion modes. b, The distribution of QC samples in PCA of lipidomic data in positive- (left panel) and negative- (right panel) ion modes. c, The numbers and proportions of annotated polar metabolites (top panel) and lipids (bottom panel) in our study. FA, Fatty Acid; GL, Glycerolipid; GP, Glycerophospholipid; SP, Sphingolipid; ST, Sterol Lipids. d, A volcano plot of the 669 annotated polar metabolites (top panel) and 1312 lipids (bottom panel) profiled. Differentially abundant metabolites of different categories were individually color coded. e, Log₂ fold change (FC) of different categories of polar metabolites (top panel) and lipids (bottom panel) between tumor and normal tissues. The dashed red line represents equal metabolite abundance in tumor and normal tissues. Tumor, n = 501 biologically independent samples; Normal, n = 76 biologically independent samples. Center line indicates the median, and bounds of box indicate the 25th and 75th percentiles, the whiskers represent the 1.5× interquartile range. f, A pathway-based analysis of metabolomic changes between tumor and normal tissues. The differential abundance (DA) score captures the average, gross changes for all metabolites in a pathway. A score of 1 indicates that all measured metabolites in the pathway increase in the tumor compared to normal tissues, and a score of −1 indicates that all measured metabolites in a pathway decrease. Pathways with at least three measured metabolites were used for DA score calculation. Tumor, n = 501 biologically independent samples; Normal, n = 76 biologically independent samples. For d, P values are calculated using the two-sided Kruskal–Wallis test and adjusted by the Benjamini–Hochberg procedure.
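The legend does not give the DA score formula. A minimal sketch, assuming the commonly used definition (number of significantly increased metabolites minus number significantly decreased, divided by the number of measured metabolites in the pathway), is shown below. The function name `da_score` and its parameters are hypothetical and not taken from the study's code.

```python
from typing import Optional, Sequence


def da_score(
    log2_fc: Sequence[float],
    is_significant: Sequence[bool],
    min_measured: int = 3,
) -> Optional[float]:
    """Differential abundance score for one pathway (assumed definition).

    log2_fc: tumor-versus-normal log2 fold change of each measured metabolite.
    is_significant: whether each metabolite passed the significance cutoff.
    Returns None when fewer than `min_measured` metabolites were measured.
    """
    if len(log2_fc) != len(is_significant):
        raise ValueError("log2_fc and is_significant must have equal length")
    n_measured = len(log2_fc)
    if n_measured < min_measured:
        return None
    pairs = list(zip(log2_fc, is_significant))
    n_up = sum(1 for fc, sig in pairs if sig and fc > 0)
    n_down = sum(1 for fc, sig in pairs if sig and fc < 0)
    return (n_up - n_down) / n_measured


# Hypothetical pathway with four measured metabolites:
# two significantly up, one significantly down, one unchanged.
example = da_score([1.0, -1.0, 0.3, 0.2], [True, True, False, True])
```

Under this definition the score is bounded between −1 and 1, matching the interpretation given in the legend.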

Source data

Extended Data Fig. 6 Integrated analysis of immunogenomic characteristics of breast cancer.

a, CIBERSORT estimated cell proportion of 22 types of immune cells among TME phenotypes (Cold: n = 296 tumors; Moderate: n = 191 tumors; Hot: n = 265 tumors). Cell abundance was normalized across samples. b, ESTIMATE evaluated immune and stromal signatures among different TME phenotypes in each PAM50 subtype (LumA: n = 222 tumors; LumB: n = 221 tumors; HER2: n = 148 tumors; Basal: n = 112 tumors; Normal: n = 49 tumors). For the boxplot, center line indicates the median value, lower and upper hinges represent the 25th and 75th percentiles, respectively, and whiskers denote 1.5 × interquartile range. c, K-means clustering of TCGA cohort based on the estimated abundance of 24 microenvironment cell types (Cold: n = 419 tumors; Moderate: n = 458 tumors; Hot: n = 202 tumors). d, Distribution of TME phenotypes across the PAM50 subtypes in TCGA cohort. e, Proportions of tumor microenvironment cells deconvoluted from scRNA-seq data (n = 752 tumors). f, g, Comparison of the expression of MHC (f) and innate immune (g) molecules among TME phenotypes in each indicated PAM50 subtype (n = 752 tumors). h, Comparison of virus mimicry signature among TME phenotypes in each indicated intrinsic subtype (LumA: n = 222 tumors; LumB: n = 221 tumors; HER2: n = 148 tumors; Basal: n = 112 tumors; Normal: n = 49 tumors). Center line indicates the median value, lower and upper hinges represent the 25th and 75th percentiles, respectively, and whiskers denote 1.5 × interquartile range. For b, h, P values are calculated using the two-sided Kruskal–Wallis test adjusted by Benjamini–Hochberg (BH) procedure.

Source data

Extended Data Fig. 7 Recurrent ERBB2 fusion transcripts in HER2-positive tumors.

a, Distribution of fusion genes across chromosomes. b, The circle represents the landscape of fusion genes. Recurrent fusions (more than two samples) are displayed as connected gene pairs, in which the width of the connecting arc represents the number of samples that contained the fusion. Red indicates novel gene fusions not present in public databases (FusionGDB and ChimerDB). c, Bar chart showing the top 11 recurrent fusion genes. d, e, Distribution of fusion genes in IHC subtypes (d) (HR⁺HER2⁻, n = 468 tumors; HR⁺HER2⁺, n = 100 tumors; HR⁻HER2⁺, n = 81 tumors; TNBC, n = 103 tumors; Paratumour, n = 60 samples) and PAM50 subtypes (e) (Luminal A, n = 222 tumors; Luminal B, n = 221 tumors; HER2-enriched, n = 148 tumors; Basal-like, n = 112 tumors; Normal-like, n = 49 tumors; Paratumour, n = 60 samples). For the boxplot, center line indicates the median value, lower and upper hinges represent the 25th and 75th percentiles, respectively, and whiskers denote 1.5 × interquartile range. f, The proportions of fusion types proximal to ERBB2 on chromosome 17q. g, Circos plot displaying ERBB2 fusions. h, Propensity-matched survival analysis for HER2-positive patients with or without ERBB2 fusions. For d, e, the statistical analysis was performed using the Kruskal–Wallis test. For h, survival distributions were compared using the log-rank test.

Source data

Extended Data Fig. 8 Data dimension, overall performance multimodal prognosis prediction model and feature importance of TMPIC model.

a, Upset plot showing the number of patients with each combination of data modalities. Vertical bars in the upper plot show the number of patients with the combination of data modalities denoted by the black circles in the plot below. C, clinical stage; I, IHC subtype; T, transcriptomic data; P, digital pathology data; M, metabolomic data; R, radiologic data. b, Comparison of C-indices of models of single modalities (n = 6 models), of 2 to 3 modalities (n = 15 models) and of 4 to 6 modalities (n = 16 models). For the boxplot, center line indicates the median value, lower and upper hinges represent the 25th and 75th percentiles, respectively, and whiskers denote 1.5 × interquartile range. FDR, false discovery rate. c, Feature importance score of the TMPIC model. New C-indices were calculated by dropping each individual feature in turn from the TMPIC model. The feature importance score was calculated as the difference between the original C-index and the new C-index in the testing cohort (n = 80 patients). For b, P values were obtained from the Kruskal–Wallis test with false discovery rate correction.
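The drop-one-feature importance described for panel c can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `concordance_index` is a simplified Harrell's C-index that skips tied event times, and `c_index_for` is a hypothetical callable standing in for refitting the model on a feature subset and scoring it on the testing cohort.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[bool], risks: Sequence[float]
) -> float:
    """Simplified Harrell's C-index; pairs with tied times are skipped."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # `first` is the patient with the shorter follow-up time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # censored before the other time: pair not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def drop_one_importance(
    features: Sequence[str], c_index_for: Callable[[List[str]], float]
) -> Dict[str, float]:
    """Importance = original C-index minus C-index without the feature."""
    baseline = c_index_for(list(features))
    return {
        name: baseline - c_index_for([f for f in features if f != name])
        for name in features
    }


# Toy scorer (hypothetical numbers) in place of a refitted survival model.
toy_scorer = lambda fs: 0.5 + 0.1 * ("T" in fs) + 0.05 * ("M" in fs)
importance = drop_one_importance(["T", "M"], toy_scorer)
```

A positive score means that removing the feature lowers the C-index, so the feature contributes to prognostic discrimination.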

Source data

Supplementary information

Reporting Summary

Supplementary Table 1

a, Clinical and molecular characteristics of the involved patients. b, Mutational signatures contribution per intrinsic subtype. c, Frequent somatic mutations and germline variants shown in Fig. 1. d, Frequent cancer-related copy number gain/amplification between different intrinsic subtypes. e, Frequent cancer-related copy number loss/deletion between different intrinsic subtypes. f, Transcriptome data shown in Fig. 1. g, Differentially expressed proteins across intrinsic subtypes. h, Differentially expressed polar metabolites across intrinsic subtypes. i, Differentially expressed lipids across intrinsic subtypes.

Supplementary Table 2

a, Clinical features and molecular subtypes between CBCGA and TCGA white individuals. b, Frequent mutations between CBCGA and TCGA white individuals (IDC). c, Intrinsic subtypes between CBCGA and TCGA white individuals (IDC). d, Enriched copy number amplifications between CBCGA and TCGA white individuals (IDC). e, Enriched copy number deletions between CBCGA and TCGA white individuals (IDC).

Supplementary Table 3

Effects of CNAs on mRNA and protein (P values were calculated using the two-sided Spearman’s rank test and were adjusted for multiple testing using the FDR method).

Supplementary Table 4

a, Additional samples. Supplementary information of the additional 58 TNBC samples for metabolomic detection. b, Polar metabolites. log₂ transformed abundance of MS2 annotated polar metabolites in tumor and normal tissues of the CBCGA cohort. c, Lipids. log₂ transformed abundance of MS2 annotated lipids in tumor and healthy tissues of the CBCGA cohort. d, Protein network. Protein annotations of metabolic protein network. e, Metabolite network. Polar metabolite annotations of metabolite network. f, Correlations. Correlation of subtype-specific metabolic proteins and subtype-specific polar metabolites.

Supplementary Table 5

a, Single-sample GSEA estimated abundance of tumor microenvironment cells. b, CIBERSORT estimated proportion of tumor microenvironment cells. c, scRNA deconvolution. Deconvoluted proportion of tumor microenvironment cells based on scRNA-seq data. d, Immunogenomic indicators of the cohort. e, Somatic mutations of each TME phenotype. f, Copy-number alterations of each TME phenotype.

Supplementary Table 6

a, List of fusion events in CBCGA cohort. b, The reading frame of fusion transcripts in CBCGA cohort.

Supplementary Table 7

a, Features for multimodal integration. b, C-indices of models combining multimodal features to stratify patient prognosis in the testing cohort. c, Risk scores for each patient and values of multimodal features used in the TMPIC model.

Source data

Source Data Fig. 1

Statistical Source Data.

Source Data Fig. 2

Statistical Source Data.

Source Data Fig. 3

Statistical Source Data.

Source Data Fig. 4

Statistical Source Data.

Source Data Fig. 5

Statistical Source Data.

Source Data Fig. 6

Statistical Source Data.

Source Data Extended Data Fig. 1

Statistical Source Data.

Source Data Extended Data Fig. 2

Statistical Source Data.

Source Data Extended Data Fig. 3

Statistical Source Data.

Source Data Extended Data Fig. 4

Statistical Source Data.

Source Data Extended Data Fig. 5

Statistical Source Data.

Source Data Extended Data Fig. 6

Statistical Source Data.

Source Data Extended Data Fig. 7

Statistical Source Data.

Source Data Extended Data Fig. 8

Statistical Source Data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Jiang, YZ., Ma, D., Jin, X. et al. Integrated multiomic profiling of breast cancer in the Chinese population reveals patient stratification and therapeutic vulnerabilities. Nat Cancer 5, 673–690 (2024). https://doi.org/10.1038/s43018-024-00725-0


Received: 02 December 2022
Accepted: 04 January 2024
Published: 12 February 2024
Issue Date: April 2024
DOI: https://doi.org/10.1038/s43018-024-00725-0

This article is cited by

Multicenter radio-multiomic analysis for predicting breast cancer outcome and unravelling imaging-biological connection
- Chao You
- Guan-Hua Su
- Ya-Jia Gu
npj Precision Oncology (2024)
Epoxy metabolites of linoleic acid promote the development of breast cancer via orchestrating PLEC/NFκB1/CXCL9-mediated tumor growth and metastasis
- Kai-Di Ni
- Xian Fu
- Jun-Yan Liu
Cell Death & Disease (2024)
