DOI:10.1145/3498731.3498742
research-article

Stacking Ensemble Method for Early and Advanced Stage Lung Adenocarcinoma Classification Based on miRNA Expression

Published: 26 January 2022

Abstract

Lung cancer, in its various types, is a leading cause of death worldwide. Many studies have pointed out that dysregulation of microRNAs (miRNAs) can be a useful marker for a variety of cancers, including lung cancer. Successful treatment of any cancer depends on clinical expertise, treatment resources, and the stage at the time of diagnosis. We therefore sought a novel miRNA expression marker to determine the stage of lung adenocarcinoma (LUAD). In this manuscript, we propose a stacking ensemble method for classifying early- and advanced-stage LUAD using miRNA expression data. Our benchmark dataset comprised 445 early-stage and 114 advanced-stage LUAD patients. Because the benchmark dataset was imbalanced, we balanced it using the Synthetic Minority Over-sampling Technique (SMOTE). We then divided the balanced dataset of LUAD patients into a training set (80%) and a testing set (20%). The Random Forest (RF) technique was used to select the best features (miRNA sequence expression) out of 1880 miRNAs, followed by a machine learning (ML) stacking ensemble method to classify early- and advanced-stage LUAD. Compared with the traditional ML classifiers used as baselines, the stacking ensemble method classified early- and advanced-stage LUAD more effectively, with 99% accuracy. The precision of the proposed method was 92% for early-stage LUAD and 84% for advanced-stage LUAD. Similarly, its recall for early- and advanced-stage LUAD was 82% and 93%, respectively, and its F1-score was 87% and 88%, respectively. In conclusion, the results demonstrate the effectiveness of the ensemble method for classifying early- and advanced-stage LUAD using miRNA expression data. The top 10 miRNA sequences identified by the model can help inform treatment decisions for early- and advanced-stage LUAD to increase the chances of survival.
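The balancing and splitting steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `smote` helper is a hypothetical standard-library re-creation of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), the three-feature Gaussian data are made up, and balancing to a 1:1 class ratio is an assumption. Only the class counts (445 and 114) and the 80/20 split come from the abstract.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Hypothetical SMOTE-style helper: build n_new synthetic points by
    interpolating between a minority sample and one of its k nearest
    minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Made-up stand-in for expression profiles: 3 toy features per patient.
rng = random.Random(42)
early = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(445)]
advanced = [[rng.gauss(1.0, 1.0) for _ in range(3)] for _ in range(114)]

# Assumption: oversample the advanced-stage class up to the early-stage count.
advanced_balanced = advanced + smote(advanced, len(early) - len(advanced))

# Label early stage as 0 and advanced stage as 1, then shuffle.
samples = [(x, 0) for x in early] + [(x, 1) for x in advanced_balanced]
rng.shuffle(samples)

# 80/20 split of the balanced data, using integer arithmetic for the cut.
n_train = len(samples) * 80 // 100
train, test = samples[:n_train], samples[n_train:]

print(len(advanced_balanced), len(samples), len(train), len(test))
# prints: 445 890 712 178
```

Under the 1:1 assumption, 331 synthetic advanced-stage profiles are added, giving 890 samples split into 712 for training and 178 for testing. The sketch follows the order given in the abstract (oversample, then split); a common alternative is to oversample only the training portion so that no synthetic point is derived from a test patient.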

Published In

ICBBS '21: Proceedings of the 2021 10th International Conference on Bioinformatics and Biomedical Science
October 2021
207 pages
ISBN:9781450384308
DOI:10.1145/3498731
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Lung Adenocarcinoma
  2. Machine learning
  3. Stack ensemble method
  4. miRNA

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICBBS 2021
