Abstract
Accurately classifying bladder cancer patients by Tumor Mutational Burden (TMB) is important for prognosis and treatment decisions. To this end, we present a novel approach that leverages multi-omics data to differentiate between low and high TMB classes. The model combines feature selection and predictive modeling to identify robust biomarkers associated with TMB classification. A genetic algorithm performs feature selection across DNA methylation, copy number alteration, and RNA-seq datasets, reducing the dimensionality of the input data while retaining the most informative attributes. The selected features are then projected into a latent space using non-negative matrix factorization, which captures the underlying patterns within the multi-omics data. A convolutional neural network, among other machine learning models, is then used to predict the TMB class. The model shows promising classification performance, demonstrating the potential of these multi-omics biomarkers to distinguish between low and high TMB classes. Survival analysis reveals a substantial difference between the cohorts classified as low-TMB and high-TMB. We propose a robust framework for TMB classification in bladder cancer that integrates multi-omics data, advanced machine learning techniques, and survival analysis to pave the way for improved prognostic insights and personalized therapeutic interventions.
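The pipeline described in the abstract (genetic-algorithm feature selection followed by a non-negative matrix factorization projection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the function names (`genetic_select`, `nmf`, `centroid_accuracy`), the fitness function, and all hyperparameters are hypothetical, and a nearest-centroid rule stands in for the convolutional neural network and other classifiers used in the study. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a non-negative multi-omics matrix (samples x features).
# Only the first 5 of 40 features carry class signal; this is an illustrative
# assumption, not the TCGA bladder cancer data.
n_samples, n_features = 120, 40
y = rng.integers(0, 2, size=n_samples)           # 0 = low TMB, 1 = high TMB
X = rng.random((n_samples, n_features))
X[:, :5] += 1.5 * y[:, None]


def centroid_accuracy(X, y):
    """Training accuracy of a nearest-centroid rule (a cheap stand-in classifier)."""
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    pred = np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)
    return float((pred.astype(int) == y).mean())


def fitness(mask):
    """Score a boolean feature mask; a small size penalty favours compact subsets."""
    if not mask.any():
        return 0.0
    return centroid_accuracy(X[:, mask], y) - 0.002 * mask.sum()


def genetic_select(pop_size=30, generations=25, mutation_rate=0.05):
    """Toy genetic algorithm over boolean feature masks."""
    pop = rng.random((pop_size, n_features)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]             # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_features)             # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_features) < mutation_rate  # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(m) for m in pop])
    return pop[np.argmax(scores)]


def nmf(V, rank, n_iter=200, eps=1e-9):
    """Multiplicative-update NMF minimising the Frobenius norm of V - W @ H."""
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


mask = genetic_select()                 # step 1: select features
W, H = nmf(X[:, mask], rank=3)          # step 2: project samples into a latent space
latent_acc = centroid_accuracy(W, y)    # step 3: classify in the latent space
print(int(mask.sum()), W.shape, round(latent_acc, 3))
```

In this sketch the rows of `W` play the role of the latent representation passed to a downstream classifier. Real copy number data can contain negative values, so it would need to be shifted or otherwise transformed before NMF applies; the synthetic matrix here is non-negative by construction.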
Data availability
The dataset analysed during the current study is available in the cBioPortal repository at https://www.cbioportal.org/study/summary?id=blca_tcga_pan_can_atlas_2018.
Funding
This research was funded by the Scientific Research and Innovation Support Fund/Ministry of Higher Education and Scientific Research/Jordan, Grant number (ICT/1/16/2022). The recipients of this fund are Abedalrhman Alkhateeb and Hazem Qattous.
Author information
Contributions
Conceptualization, AA, SA, and MA; Data curation, IA and NA; Formal analysis, IA, LA, HQ, and AA; Funding acquisition, AA; Investigation, AA, LA, and SA; Methodology, IA, NA, AA, and MA; Project administration, AA and MA.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Al-Ghafer, I.A., AlAfeshat, N., Alshomali, L. et al. NMF-guided feature selection and genetic algorithm-driven framework for tumor mutational burden classification in bladder cancer using multi-omics data. Netw Model Anal Health Inform Bioinforma 13, 26 (2024). https://doi.org/10.1007/s13721-024-00460-7
DOI: https://doi.org/10.1007/s13721-024-00460-7