Computer Science > Machine Learning
[Submitted on 14 Feb 2024]
Title:Pulmonologists-Level lung cancer detection based on standard blood test results and smoking status using an explainable machine learning approach
View PDF HTML (experimental)Abstract:Lung cancer (LC) remains the primary cause of cancer-related mortality, largely due to late-stage diagnoses. Effective strategies for early detection are therefore of paramount importance. In recent years, machine learning (ML) has demonstrated considerable potential in healthcare by facilitating the detection of various diseases. In this retrospective development and validation study, we developed an ML model based on dynamic ensemble selection (DES) for LC detection. The model leverages standard blood sample analysis and smoking history data from a large population at risk in Denmark. The study includes all patients examined on suspicion of LC in the Region of Southern Denmark from 2009 to 2018. We validated and compared the predictions by the DES model with diagnoses provided by five pulmonologists. Among the 38,944 patients, 9,940 had complete data of which 2,505 (25\%) had LC. The DES model achieved an area under the roc curve of 0.77$\pm$0.01, sensitivity of 76.2\%$\pm$2.4\%, specificity of 63.8\%$\pm$2.3\%, positive predictive value of 41.6\%$\pm$1.2\%, and F\textsubscript{1}-score of 53.8\%$\pm$1.1\%. The DES model outperformed all five pulmonologists, achieving a sensitivity 9\% higher than their average. The model identified smoking status, age, total calcium levels, neutrophil count, and lactate dehydrogenase as the most important factors for the detection of LC. The results highlight the successful application of the ML approach in detecting LC, surpassing pulmonologists' performance. Incorporating clinical and laboratory data in future risk assessment models can improve decision-making and facilitate timely referrals.
Submission history
From: Abdolrahman Peimankar [view email][v1] Wed, 14 Feb 2024 22:00:57 UTC (1,616 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
IArxiv Recommender
(What is IArxiv?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.