Identification of Novel Diagnostic and Prognostic Gene Signature Biomarkers for Breast Cancer Using Artificial Intelligence and Machine Learning Assisted Transcriptomics Analysis
Figure 1. Boxplot showing the expression distribution for dataset GSE7904. (A) Raw (un-normalized) expression distribution with log2 scale in the range of −200 to 400. (B) Normalized intensities showing nearly identical distributions of expression intensities, with the log2 scale in the range of 0 to 12.
Figure 2. Volcano plot showing differentially expressed genes: (i) the majority were non-significant (black), (ii) upregulated DEGs (red), and (iii) downregulated DEGs (blue).
Figure 3. Canonical pathways derived using the IPA tool. (A) Kinetochore metaphase signaling pathway, (B) PTEN pathway overlapped with breast cancer associated genes.
Figure 4. (A) Venn diagram showing 28 hub genes derived from the intersection of DEGs > 3 ML and DEGs > 4 datasets. (B) Unsupervised hierarchical clustering: heatmap of 701 samples, including 356 breast tumor (BT, cyan) and 345 normal breast (NB, pink) tissues, showing the gene expression pattern of 28 hub genes, including diagnostic and prognostic gene signatures. Upregulated genes are shown in red and downregulated genes are in blue.
Figure 5. K-nearest neighbors (KNN)-based ML model for the diagnostic gene signature showing the mean ROC (AUC 0.989 ± 0.013).
Figure 6. PCA plot showing an overall distribution of the samples (n = 701), including breast tumor (blue) and normal breast tissue (red) based on transcriptomics profiles: (A) 54,675 probes, (B) 355 DEGs, (C) 28 hub genes, and (D) diagnostic nine-gene signature.
Figure 7. KM plot based on the relapse-free survival analysis of eight individual genes (mRNA, gene-chip) of the prognostic gene signature. The X-axis and Y-axis represent time in months and the probability of the survival of patients, respectively. The impact of the high and low expression of the gene on patient survival is shown in red and black lines, respectively.
Figure 8. KM plot based on the overall survival analysis of eight individual genes (mRNA, gene-chip) of the prognostic gene signature. The X-axis and Y-axis represent time in months and the probability of the survival of patients, respectively. The impact of the high and low expression of the gene on patient survival is shown in red and black lines, respectively.
Figure 9. RFS and OS analyses and the validation of upregulated (CCNE2, NUSAP1, TPX2, and S100P) and downregulated (ITM2A, LIFR, TNXA, and ZBTB16) gene groups (mRNA, RNA seq) of the prognostic gene signature. The X-axis and Y-axis represent time in months and the probability of the survival of patients, respectively. The impact of the high and low expression of the gene on patient survival is shown in red and black lines, respectively.
Figure 10. Gradient-boosting decision tree (GBDT)-based ML model for the prognostic gene signature showing the mean ROC (AUC 0.993 ± 0.006).
Figure 11. qRT-PCR results showing overexpression of COL10A1, S100P, WISP1, COMP, CXCL10, COL11A1, INHBA; CCNE2, NUSAP1, TPX2, and S100P genes, and under-expression of ADAMTS5, LYVE1, ITM2A, LIFR, TNXA, and ZBTB16 genes.
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Sets and Patients
2.2. Preprocessing and Differential Expression Analysis
2.3. Functional Pathway and Gene Set Enrichment Analysis
2.4. Machine Learning and Feature Selection Methods
2.4.1. RFECV with Logistic Regression or with SVM
2.4.2. LASSO Regularization (L1) Using Logistic Regression or Support Vector Classification
2.4.3. Random Forest
2.4.4. Extra Trees Classifier
2.4.5. Genetic Algorithm
2.4.6. XGBoost
2.4.7. GBDT
2.4.8. MLP
2.4.9. AdaBoost
2.4.10. KNN
2.5. Survival Analysis Using the Kaplan–Meier Estimator
2.6. RNA Isolation and qRT-PCR
2.7. Statistical Analysis
3. Results
3.1. Differentially Expressed Genes in BC
3.2. Function Pathway Analysis and Network Enrichment Analysis
3.3. Machine Learning Algorithms for the Identification of Diagnostic Biomarker Genes
3.4. Machine-Learning-Algorithm-Based 10-Fold Cross-Validation
3.5. Survival Analysis to Identify Genes with Prognostic Importance
3.6. qRT-PCR Analysis
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References

Dataset | Title/Description | Normalization Method | No. of Samples | Percentage of Tumor Samples |
---|---|---|---|---|
GSE61304 | Novel biomarker discovery for stratification and prognosis of breast cancer patients | MAS5 signal intensity | 62 (58 breast tumor + 4 normal breast) | 94% |
GSE42568 | Breast cancer gene expression analysis | Log2 GCRMA signal intensity | 121 (104 breast tumor + 17 normal breast) | 86% |
GSE7904 | Expression data from human breast tissue | RMA expression value | 50 (43 breast tumor + 7 normal breast) | 86% |
GSE3744 | Human breast tumor expression | GCRMA calculated signal intensity, log2 transformed | 47 (40 breast tumor + 7 normal breast) | 85% |
GSE29431 | Identifying breast cancer biomarkers | RMA expression values | 66 (54 breast tumor + 12 normal breast) | 82% |
GSE26910 | Stromal molecular signatures of breast and prostate cancer | Log2 RMA signal | 12 (6 breast tumor + 6 normal breast) | 50% |
GSE31138 | Identifying novel anti-angiogenic targets in human breast cancer | Log2 RMA signal | 6 (3 breast tumor + 3 normal breast) | 50% |
GSE71053 | Differential effect of surgical manipulation on gene expression in normal breast tissue and breast tumour tissue | Log2-normalized signal | 18 (6 breast tumor + 12 normal breast) | 33% |
GSE10780 | Proliferative genes dominate malignancy risk gene signature in histologically normal breast tissue | RMA expression value | 185 (42 breast tumor + 143 normal breast) | 23% |
GSE30010 | Expression data from breast samples of postmenopausal women | RMA expression value | 107 (0 breast tumor + 107 normal breast) | 0% |
GSE111662 | Whole breast tissue gene expression in comparison to expression in epithelial and stromal tissues | RMA expression values | 27 (0 breast tumor + 27 normal breast) | 0% |
Total | | | 701 (356 breast tumors + 345 normal breasts) | 51% |
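
The percentage column in the table above can be recomputed from the tumor and normal counts. The following is a minimal sketch of that arithmetic; the function name `tumor_percentage` and the whole-number rounding are assumptions made for illustration, not something stated in the source.

```python
# (tumor, normal, reported tumor %) copied from the dataset table above.
datasets = {
    "GSE61304": (58, 4, 94),
    "GSE42568": (104, 17, 86),
    "GSE7904": (43, 7, 86),
    "GSE3744": (40, 7, 85),
    "GSE29431": (54, 12, 82),
    "GSE26910": (6, 6, 50),
    "GSE31138": (3, 3, 50),
    "GSE71053": (6, 12, 33),
    "GSE10780": (42, 143, 23),
    "GSE30010": (0, 107, 0),
    "GSE111662": (0, 27, 0),
}


def tumor_percentage(tumor, normal):
    """Share of tumor samples as a whole-number percentage (rounding is assumed)."""
    return round(100 * tumor / (tumor + normal))


# Pooled counts across all eleven datasets.
total_tumor = sum(t for t, _, _ in datasets.values())
total_normal = sum(n for _, n, _ in datasets.values())

for name, (tumor, normal, reported) in datasets.items():
    computed = tumor_percentage(tumor, normal)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{name}: computed {computed}%, reported {reported}% ({status})")

print(f"Total: {total_tumor} tumor + {total_normal} normal "
      f"= {total_tumor + total_normal} samples, "
      f"{tumor_percentage(total_tumor, total_normal)}% tumor")
```

Pooling the eleven datasets gives a roughly balanced tumor-to-normal ratio even though the individual datasets range from all-normal to almost all-tumor.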

Gene Symbol | Gene Name | Log2FC | Adj. p-Value | Regulation (Decide Test) |
---|---|---|---|---|
COL11A1 | Collagen Type XI Alpha 1 Chain | 4.36 | 1.69 × 10^−172 | Upregulated |
TOP2A | DNA Topoisomerase II Alpha | 3.96 | 4.16 × 10^−220 | Upregulated |
S100P | S100 Calcium-Binding Protein P | 3.70 | 3.57 × 10^−137 | Upregulated |
COL10A1 | Collagen Type X Alpha 1 Chain | 3.59 | 7.47 × 10^−192 | Upregulated |
RRM2 | Ribonucleotide Reductase Regulatory Subunit M2 | 3.47 | 4.44 × 10^−205 | Upregulated |
CKS2 | CDC28 Protein Kinase Regulatory Subunit 2 | 3.26 | 8.66 × 10^−201 | Upregulated |
MMP1 | Matrix Metallopeptidase 1 | 3.21 | 7.95 × 10^−113 | Upregulated |
COMP | Cartilage Oligomeric Matrix Protein | 3.15 | 6.27 × 10^−137 | Upregulated |
NUSAP1 | Nucleolar And Spindle-Associated Protein 1 | 3.08 | 9.67 × 10^−194 | Upregulated |
ANLN | Anillin, Actin-Binding Protein | 3.07 | 2.42 × 10^−173 | Upregulated |
ADH1B | Alcohol Dehydrogenase 1B (Class I), Beta Polypeptide | −4.84 | 2.10 × 10^−168 | Downregulated |
ADIPOQ | Adiponectin, C1Q And Collagen Domain Containing | −4.47 | 6.08 × 10^−119 | Downregulated |
PLIN1 | Perilipin 1 | −4.20 | 8.42 × 10^−161 | Downregulated |
LEP | Leptin | −4.11 | 7.20 × 10^−105 | Downregulated |
LPL | Lipoprotein Lipase | −4.09 | 2.35 × 10^−130 | Downregulated |
SDPR | Serum Deprivation Response | −4.06 | 1.34 × 10^−201 | Downregulated |
RBP4 | Retinol Binding Protein 4, Plasma | −4.06 | 4.37 × 10^−118 | Downregulated |
C2orf40 | Chromosome 2 Open Reading Frame 40 | −4.05 | 3.15 × 10^−211 | Downregulated |
ABCA8 | ATP-Binding Cassette Subfamily A Member 8 | −4.05 | 4.21 × 10^−166 | Downregulated |
NTRK2 | Neurotrophic Tyrosine Kinase, Receptor, Type 2 | −4.04 | 4.78 × 10^−181 | Downregulated |
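
To read the Log2FC column on a linear scale, each value can be converted with 2 raised to the log2 fold change. The sketch below does this for a few rows of the table; the helper names and the |log2FC| ≥ 1 cutoff used in `direction` are illustrative assumptions, since the thresholds actually applied in the study are not given in this section.

```python
# Log2 fold changes (tumor vs. normal) copied from the table above.
log2fc = {
    "COL11A1": 4.36,
    "TOP2A": 3.96,
    "ADH1B": -4.84,
    "ADIPOQ": -4.47,
}


def linear_fold_change(l2fc):
    """Signed linear fold change.

    Positive values are tumor/normal ratios; negative values are
    normal/tumor ratios, so -28 means roughly 28-fold lower in tumor.
    """
    ratio = 2 ** abs(l2fc)
    return ratio if l2fc >= 0 else -ratio


def direction(l2fc, cutoff=1.0):
    """Label a gene by direction; `cutoff` is a hypothetical |log2FC| threshold."""
    if l2fc >= cutoff:
        return "Upregulated"
    if l2fc <= -cutoff:
        return "Downregulated"
    return "Unchanged"


for gene, value in log2fc.items():
    print(f"{gene}: log2FC {value:+.2f} -> "
          f"{linear_fold_change(value):+.1f}-fold ({direction(value)})")
```

A log2FC of 4.36 therefore corresponds to an expression level a little over 20 times higher in tumor tissue than in normal tissue.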

Gene Set | Description | Gene Set Size | Expected Value | Overlap | Enrichment Ratio | FDR |
---|---|---|---|---|---|---|
GO:0031012 | Extracellular matrix | 487 | 9.48 | 35 | 3.69 | 3.31 × 10^−8 |
GO:0051301 | Cell division | 576 | 11.21 | 41 | 3.65 | 2.69 × 10^−9 |
GO:1903047 | Mitotic cell cycle process | 780 | 15.18 | 46 | 3.02 | 2.62 × 10^−8 |
GO:0016477 | Cell migration | 1352 | 26.32 | 68 | 2.58 | 1.45 × 10^−9 |
GO:0042127 | Regulation of cell proliferation | 1535 | 29.88 | 72 | 2.40 | 3.75 × 10^−9 |
GO:0048870 | Cell motility | 1493 | 29.06 | 69 | 2.37 | 1.64 × 10^−8 |
GO:0051674 | Localization of cell | 1493 | 29.06 | 69 | 2.37 | 1.64 × 10^−8 |
GO:0009719 | Response to endogenous stimulus | 1574 | 30.64 | 71 | 2.31 | 2.01 × 10^−8 |
GO:0008283 | Cell proliferation | 1953 | 38.02 | 87 | 2.28 | 6.20 × 10^−10 |
GO:0009888 | Tissue development | 1814 | 35.31 | 77 | 2.18 | 3.31 × 10^−8 |
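
In over-representation analysis, the enrichment ratio is the observed overlap divided by the overlap expected by chance. The sketch below recomputes it for several rows of the table above; the tolerance of 0.02 is an assumption chosen to absorb the rounding of the tabulated expected values.

```python
# (gene set, expected overlap, observed overlap, reported ratio) from the table above.
gene_sets = [
    ("GO:0031012", 9.48, 35, 3.69),
    ("GO:0051301", 11.21, 41, 3.65),
    ("GO:1903047", 15.18, 46, 3.02),
    ("GO:0016477", 26.32, 68, 2.58),
    ("GO:0008283", 38.02, 87, 2.28),
]

TOLERANCE = 0.02  # assumed allowance for rounding in the published values


def enrichment_ratio(observed, expected):
    """Observed overlap divided by the overlap expected by chance."""
    return observed / expected


for term, expected, observed, reported in gene_sets:
    ratio = enrichment_ratio(observed, expected)
    agrees = abs(ratio - reported) <= TOLERANCE
    print(f"{term}: {observed}/{expected} = {ratio:.2f} "
          f"(reported {reported:.2f}, {'agrees' if agrees else 'differs'})")
```

The recomputed ratios agree with the table to within about 0.01, which is consistent with the expected values having been rounded to two decimals before publication.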

ML Model | Mean AUC | Mean Accuracy | Mean Precision | Mean Recall | Mean F1 |
---|---|---|---|---|---|
KNN | 0.989 | 0.981 | 0.983 | 0.980 | 0.982 |
GBDT | 0.995 | 0.973 | 0.970 | 0.978 | 0.973 |
AdaBoost | 0.992 | 0.974 | 0.972 | 0.977 | 0.975 |
XGBoost | 0.994 | 0.971 | 0.969 | 0.975 | 0.972 |
MLP | 0.975 | 0.960 | 0.961 | 0.961 | 0.960 |
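
F1 is the harmonic mean of precision and recall, so the last column can be roughly cross-checked against the two before it. The sketch below is only an approximate check: the table reports means over cross-validation folds, and the mean of per-fold F1 scores need not equal the F1 of the mean precision and recall. The tolerance is an assumption.

```python
# (model, mean precision, mean recall, reported mean F1) from the table above.
models = [
    ("KNN", 0.983, 0.980, 0.982),
    ("GBDT", 0.970, 0.978, 0.973),
    ("AdaBoost", 0.972, 0.977, 0.975),
    ("XGBoost", 0.969, 0.975, 0.972),
    ("MLP", 0.961, 0.961, 0.960),
]

TOLERANCE = 0.002  # assumed allowance for fold averaging and rounding


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for name, precision, recall, reported in models:
    f1 = f1_score(precision, recall)
    print(f"{name}: F1 from means = {f1:.4f}, reported mean F1 = {reported:.3f}, "
          f"difference = {abs(f1 - reported):.4f}")
```

All five models differ from the reported value by no more than about 0.001, so the columns are mutually consistent at the precision shown.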

Gene Symbol | Probe ID | HR | CI | Log-Rank p-Value | Decision (Log-Rank p-Value) |
---|---|---|---|---|---|
ADAMTS5 | 219935_at | 0.9 | 0.77–0.94 | 1.50 × 10^−3 | Significant |
CCNE2 | 205034_at | 1.9 | 1.67–2.06 | 1.00 × 10^−16 | Significant |
CKS2 | 204170_s_at | 1.7 | 1.51–1.85 | 1.00 × 10^−16 | Significant |
CXCL10 | 204533_at | 1.2 | 1.12–1.37 | 4.40 × 10^−5 | Significant |
EDNRB | 206701_x_at | 0.8 | 0.69–0.85 | 2.20 × 10^−7 | Significant |
FABP4 | 203980_at | 0.9 | 0.81–0.99 | 2.58 × 10^−2 | Significant |
GPC3 | 209220_at | 0.8 | 0.76–0.92 | 5.00 × 10^−4 | Significant |
ITM2A | 202747_s_at | 0.7 | 0.63–0.77 | 1.40 × 10^−12 | Significant |
LIFR | 225575_at | 0.7 | 0.56–0.75 | 1.60 × 10^−8 | Significant |
MATN2 | 202350_s_at | 0.9 | 0.78–0.95 | 3.30 × 10^−3 | Significant |
LYVE1 | 219059_s_at | 0.9 | 0.81–0.99 | 3.78 × 10^−2 | Significant |
NUSAP1 | 218039_at | 1.7 | 1.54–1.89 | 1.00 × 10^−16 | Significant |
SCN4B | 236359_at | 0.6 | 0.55–0.75 | 1.00 × 10^−8 | Significant |
SDPR | 222717_at | 0.7 | 0.57–0.77 | 9.70 × 10^−8 | Significant |
SPRY2 | 204011_at | 0.9 | 0.79–0.97 | 1.02 × 10^−2 | Significant |
TF | 214063_s_at | 0.9 | 0.78–0.96 | 5.20 × 10^−3 | Significant |
TNXA | 216333_x_at | 0.7 | 0.63–0.77 | 2.40 × 10^−12 | Significant |
TPX2 | 210052_s_at | 1.6 | 1.48–1.82 | 1.00 × 10^−16 | Significant |
WISP1 | 229802_at | 0.8 | 0.64–0.87 | 1.00 × 10^−4 | Significant |
ZBTB16 | 205883_at | 0.7 | 0.58–0.72 | 1.00 × 10^−16 | Significant |
COL11A1 | 37892_at | 1.2 | 1.12–1.38 | 2.30 × 10^−5 | Significant |
INHBA | 210511_s_at | 1.2 | 1.06–1.3 | 1.70 × 10^−3 | Significant |
S100P | 204351_at | 1.5 | 1.31–1.61 | 6.30 × 10^−3 | Significant |
COL10A1 | 205941_s_at | 1.0 | 0.88–1.08 | 6.60 × 10^−1 | Insignificant |
COMP | 205713_s_at | 0.9 | 0.85–1.04 | 2.53 × 10^−1 | Insignificant |
GJB2 | 223278_at | 1.0 | 0.88–1.19 | 7.89 × 10^−1 | Insignificant |
LRRC15 | 213909_at | 0.9 | 0.82–1.01 | 7.16 × 10^−2 | Insignificant |
MME | 203435_s_at | 1.1 | 0.98–1.2 | 1.28 × 10^−1 | Insignificant |
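
The hazard ratio (HR) and log-rank p-value in the table above can be read together: a significant gene with HR above 1 is associated with worse survival when highly expressed, and one with HR below 1 with better survival. The sketch below applies that reading to a subset of rows. The 0.05 threshold is an assumption; it matches the Decision column but is not stated in the table, and the label strings are illustrative.

```python
ALPHA = 0.05  # assumed significance threshold, consistent with the Decision column

# (gene, hazard ratio, log-rank p-value) copied from the table above.
rows = [
    ("CCNE2", 1.9, 1.00e-16),
    ("NUSAP1", 1.7, 1.00e-16),
    ("TPX2", 1.6, 1.00e-16),
    ("S100P", 1.5, 6.30e-3),
    ("ITM2A", 0.7, 1.40e-12),
    ("LIFR", 0.7, 1.60e-8),
    ("ZBTB16", 0.7, 1.00e-16),
    ("COL10A1", 1.0, 6.60e-1),
    ("MME", 1.1, 1.28e-1),
]


def classify(hr, p_value, alpha=ALPHA):
    """Interpret one row: direction of risk for high expression, if significant."""
    if p_value >= alpha:
        return "not significant"
    if hr > 1:
        return "higher risk with high expression"
    if hr < 1:
        return "lower risk with high expression"
    return "significant, no direction at this precision"


for gene, hr, p_value in rows:
    print(f"{gene}: HR {hr}, p = {p_value:.2e} -> {classify(hr, p_value)}")
```

Under this reading, CCNE2, NUSAP1, TPX2, and S100P fall in the higher-risk group and ITM2A, LIFR, and ZBTB16 in the lower-risk group, matching the upregulated and downregulated gene groups named in the figure captions.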

Gene Symbol | Probe ID | HR | CI | Log-Rank p-Value | Decision (Log-Rank p-Value) |
---|---|---|---|---|---|
CCNE2 | 205034_at | 1.47 | 1.22–1.78 | 5.00 × 10−5 | Significant |
CKS2 | 204170_s_at | 1.32 | 1.09–1.59 | 3.70 × 10−3 | Significant |
ITM2A | 202747_s_at | 0.61 | 0.50–0.73 | 2.00 × 10−7 | Significant |
LIFR | 225575_at | 0.59 | 0.45–0.78 | 1.00 × 10−4 | Significant |
NUSAP1 | 218039_at | 1.65 | 1.36–2.00 | 1.90 × 10−7 | Significant |
SDPR | 222717_at | 0.70 | 0.53–0.92 | 8.90 × 10−3 | Significant |
TNXA | 216333_x_at | 0.71 | 0.59–0.85 | 3.00 × 10−4 | Significant |
TPX2 | 210052_s_at | 1.56 | 1.29–1.89 | 3.20 × 10−6 | Significant |
ZBTB16 | 205883_at | 0.63 | 0.52–0.76 | 1.40 × 10−6 | Significant |
S100P | 204351_at | 1.50 | 1.25–1.82 | 2.10 × 10−5 | Significant |
ADAMTS5 | 219935_at | 0.85 | 0.70–1.02 | 8.00 × 10−2 | Insignificant |
COL10A1 | 205941_s_at | 0.96 | 0.79–1.15 | 6.38 × 10−1 | Insignificant |
COMP | 205713_s_at | 1.06 | 0.88–1.27 | 5.67 × 10−1 | Insignificant |
CXCL10 | 204533_at | 0.90 | 0.75–1.09 | 2.98 × 10−1 | Insignificant |
EDNRB | 206701_x_at | 0.88 | 0.73–1.06 | 1.85 × 10−1 | Insignificant |
FABP4 | 203980_at | 0.84 | 0.70–1.02 | 7.78 × 10−2 | Insignificant |
GJB2 | 223278_at | 1.18 | 0.90–1.54 | 2.29 × 10−1 | Insignificant |
GPC3 | 209220_at | 0.84 | 0.70–1.02 | 7.47 × 10−2 | Insignificant |
MATN2 | 202350_s_at | 0.85 | 0.70–1.02 | 8.10 × 10−2 | Insignificant |
LRRC15 | 213909_at | 0.87 | 0.72–1.04 | 1.30 × 10−1 | Insignificant |
LYVE1 | 219059_s_at | 1.04 | 0.86–1.25 | 6.90 × 10−1 | Insignificant |
MME | 203435_s_at | 0.83 | 0.69–1.01 | 5.95 × 10−2 | Insignificant |
SCN4B | 236359_at | 0.83 | 0.63–1.08 | 1.68 × 10−1 | Insignificant |
SPRY2 | 204011_at | 0.89 | 0.74–1.08 | 2.36 × 10−1 | Insignificant |
TF | 214063_s_at | 0.99 | 0.82–1.19 | 9.08 × 10−1 | Insignificant |
WISP1 | 229802_at | 0.79 | 0.60–1.03 | 8.33 × 10−2 | Insignificant |
COL11A1 | 37892_at | 1.12 | 0.93–1.35 | 2.37 × 10−1 | Insignificant |
INHBA | 210511_s_at | 1.12 | 0.93–1.36 | 0.2212 | Insignificant |
CKS2 | 204170_s_at | 1.32 | 1.09–1.59 | 0.0612 | Insignificant |
SDPR | 222717_at | 0.7 | 0.53–0.92 | 0.0628 | Insignificant |
Survival Type | Gene Hub | HR | CI | Log-Rank p-Value | Decision by Log-Rank p-Value | Expression |
---|---|---|---|---|---|---|
RFS | CCNE2, NUSAP1, TPX2, S100P | 1.62 | 1.46–1.79 | 1.00 × 10−16 | Significant | Upregulated |
RFS | ITM2A, LIFR, TNXA-TNXB, ZBTB16 | 0.58 | 0.50–0.68 | 1.90 × 10−12 | Significant | Downregulated |
OS | CCNE2, NUSAP1, TPX2, S100P | 1.44 | 1.19–1.74 | 1.40 × 10−4 | Significant | Upregulated |
OS | ITM2A, LIFR, TNXA-TNXB, ZBTB16 | 0.57 | 0.43–0.75 | 4.20 × 10−5 | Significant | Downregulated |
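
As a reading aid for the survival tables above, the following minimal Python sketch shows how the Decision column can be reproduced from the log-rank p-value, and how a hazard ratio (HR) is commonly read together with its confidence interval. The 0.05 cutoff, the `SurvivalRow` type, and the helper names are assumptions made for illustration; they are not taken from the article. The three example rows are copied from the per-gene table above.

```python
from dataclasses import dataclass

ALPHA = 0.05  # assumed significance cutoff; the tables do not state it


@dataclass
class SurvivalRow:
    gene: str
    hr: float       # hazard ratio, high vs. low expression
    ci_low: float   # lower bound of the confidence interval
    ci_high: float  # upper bound of the confidence interval
    p_value: float  # log-rank p-value


def decision(row: SurvivalRow, alpha: float = ALPHA) -> str:
    """Label a row the way the Decision column does (assumed rule: p < alpha)."""
    return "Significant" if row.p_value < alpha else "Insignificant"


def risk_direction(row: SurvivalRow) -> str:
    """Read the HR direction only when the confidence interval excludes 1."""
    if row.ci_low > 1.0:
        return "higher risk with high expression"
    if row.ci_high < 1.0:
        return "lower risk with high expression"
    return "inconclusive"


# Rows copied from the per-gene survival table above.
rows = [
    SurvivalRow("CCNE2", 1.47, 1.22, 1.78, 5.00e-5),
    SurvivalRow("ITM2A", 0.61, 0.50, 0.73, 2.00e-7),
    SurvivalRow("TF", 0.99, 0.82, 1.19, 9.08e-1),
]

for row in rows:
    print(row.gene, decision(row), risk_direction(row))
```

Under these assumptions, the upregulated genes with HR above 1 read as markers of poorer survival and the downregulated genes with HR below 1 as markers of better survival, which matches the Expression column of the combined-signature table.
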
ML Model | Mean_AUC | Mean_ACC | Mean_Precision | Mean_Recall | Mean_F1 |
---|---|---|---|---|---|
GBDT | 0.993 | 0.980 | 0.983 | 0.977 | 0.980 |
XGBoost | 0.992 | 0.976 | 0.981 | 0.972 | 0.976 |
AdaBoost | 0.987 | 0.967 | 0.965 | 0.972 | 0.968 |
KNN | 0.985 | 0.979 | 0.978 | 0.980 | 0.979 |
MLP | 0.979 | 0.966 | 0.975 | 0.958 | 0.966 |
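
The F1 column in the table above can be sanity-checked against the precision and recall columns, since F1 is the harmonic mean of the two. The sketch below is only a consistency check on the reported means, not the authors' evaluation code; the function name `f1_from` is hypothetical. A mean of per-fold F1 scores need not equal the F1 of the mean precision and recall, so small differences in the third decimal would be expected.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (Mean_Precision, Mean_Recall, Mean_F1) copied from the table above.
reported = {
    "GBDT": (0.983, 0.977, 0.980),
    "XGBoost": (0.981, 0.972, 0.976),
    "AdaBoost": (0.965, 0.972, 0.968),
    "KNN": (0.978, 0.980, 0.979),
    "MLP": (0.975, 0.958, 0.966),
}

for model, (precision, recall, f1) in reported.items():
    recomputed = f1_from(precision, recall)
    print(f"{model}: reported F1 = {f1:.3f}, recomputed = {recomputed:.3f}")
```
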
Gene | Microarray FC | Microarray adj.p-Value | qRT-PCR Rq | qRT-PCR FC | qRT-PCR StdDev | qRT-PCR p-Value |
---|---|---|---|---|---|---|
Expression of Diagnostic Gene Signature | ||||||
COL11A1 | 3.907519 | 5.1 × 10−163 | 9.834254 | 3.297816 | 0.64495 | 3.47 × 10−7 |
COL10A1 | 3.842333 | 2.1 × 10−178 | 10.30614 | 3.365432 | 0.660567 | 1.84 × 10−8 |
S100P | 3.701498 | 3.6 × 10−137 | 5.345665 | 2.418369 | 0.995349 | 2.73 × 10−11 |
COMP | 3.150415 | 6.3 × 10−137 | 6.191366 | 2.630258 | 0.425433 | 5.81 × 10−15 |
INHBA | 3.042628 | 2.6 × 10−157 | 7.21937 | 2.851873 | 0.901961 | 1.97 × 10−8 |
WISP1 | 2.551547 | 6 × 10−105 | 4.783692 | 2.258124 | 0.461439 | 2.9 × 10−8 |
ADAMTS5 | −3.13169 | 3.3 × 10−184 | 0.209914 | −2.25213 | 0.511629 | 2.88 × 10−9 |
CXCL10 | 2.530934 | 2.16 × 10−95 | 4.763343 | 2.251974 | 0.86944 | 2.16 × 10−5 |
LYVE1 | −3.14204 | 1.2 × 10−142 | 0.177332 | −2.49548 | 0.877592 | 1.73 × 10−6 |
Expression of Prognostic Gene Signature | ||||||
CCNE2 | 2.530327 | 3.7 × 10−154 | 5.112678 | 2.354079 | 0.392123 | 3.1 × 10−15 |
NUSAP1 | 2.732299 | 2.4 × 10−124 | 7.065653 | 2.820823 | 0.9127 | 3.81 × 10−10 |
TPX2 | 2.145025 | 5.6 × 10−135 | 5.179436 | 2.372795 | 0.432888 | 4.72 × 10−10 |
ITM2A | −2.54576 | 9.1 × 10−149 | 0.241341 | −2.05085 | 0.683145 | 4.21 × 10−8 |
LIFR | −3.0494 | 5.6 × 10−159 | 0.271185 | −1.88265 | 0.853309 | 1.78 × 10−6 |
TNXA | −2.54523 | 1.9 × 10−129 | 0.228871 | −2.12739 | 0.736176 | 7.89 × 10−7 |
ZBTB16 | −2.4943 | 1.12 × 10−115 | 0.187856 | −2.4123 | 0.567998 | 3.32 × 10−9 |
S100P | 3.701498 | 3.6 × 10−137 | 5.345665 | 2.418369 | 0.995349 | 2.73 × 10−11 |
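
In the qRT-PCR columns above, the FC values appear to be the base-2 logarithm of the relative quantity Rq, so upregulated genes (Rq > 1) have positive FC and downregulated genes (Rq < 1) have negative FC. The short sketch below checks this reading on three rows copied from the table; the function name `log2_fold_change` is hypothetical and the relationship is inferred from the numbers, not stated in the table.

```python
import math


def log2_fold_change(rq: float) -> float:
    """Convert a relative quantity (Rq) to a log2 fold change."""
    return math.log2(rq)


# (Rq, reported qRT-PCR FC) copied from the table above.
examples = {
    "COL11A1": (9.834254, 3.297816),
    "ADAMTS5": (0.209914, -2.25213),
    "CCNE2": (5.112678, 2.354079),
}

for gene, (rq, reported_fc) in examples.items():
    print(f"{gene}: log2(Rq) = {log2_fold_change(rq):.4f}, reported FC = {reported_fc}")
```
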
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mirza, Z.; Ansari, M.S.; Iqbal, M.S.; Ahmad, N.; Alganmi, N.; Banjar, H.; Al-Qahtani, M.H.; Karim, S. Identification of Novel Diagnostic and Prognostic Gene Signature Biomarkers for Breast Cancer Using Artificial Intelligence and Machine Learning Assisted Transcriptomics Analysis. Cancers 2023, 15, 3237. https://doi.org/10.3390/cancers15123237