DOI: 10.1007/978-3-030-37446-4_9
Article

Local vs. Global Interpretability of Machine Learning Models in Type 2 Diabetes Mellitus Screening

Published: 26 June 2019

Abstract

Machine learning based predictive models have been used in different areas of everyday life for decades. However, with the recent availability of big data, new ways emerge on how to interpret the decisions of machine learning models. In addition to global interpretation, which focuses on the general decisions of a prediction model, this paper emphasizes the importance of local interpretation of predictions. Local interpretation focuses on the specifics of each individual and provides explanations that can lead to a better understanding of feature contributions in smaller groups of individuals, which are often overlooked by global interpretation techniques. In this paper, three machine learning based prediction models were compared: Gradient Boosting Machine (GBM), Random Forest (RF) and Generalized Linear Model with regularization (GLM). No significant differences in prediction performance, measured by mean average error, were detected: GLM: 0.573 (0.569−0.577); GBM: 0.579 (0.575−0.583); RF: 0.579 (0.575−0.583). Similar to other studies that used prediction models for screening in type 2 diabetes mellitus, we found a strong contribution of features such as age, gender and BMI at the global interpretation level. On the other hand, the local interpretation technique discovered features such as depression, smoking status or physical activity that can be influential in specific groups of patients. This study outlines the prospects of using local interpretation techniques to improve the interpretability of prediction models in the era of personalized healthcare. At the same time, we warn users and developers of prediction models that prediction performance should not be the only criterion for model selection.
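The global-vs-local distinction described above can be sketched in code. The study itself was carried out in R (h2o, iml); the following is an illustrative Python analogue, not the authors' implementation. The data, feature names, and the mean-replacement local attribution are assumptions chosen only to contrast a cohort-level importance score with a per-individual explanation.

```python
# Sketch: global vs. local interpretation of a screening-style model.
# Data and feature roles are synthetic; this is not the paper's code.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
# Three hypothetical standardized features (e.g. age, BMI, activity).
X = rng.normal(size=(n, 3))
y = (0.8 * X[:, 0] + 0.6 * X[:, 1] - 0.3 * X[:, 2]
     + rng.normal(scale=0.5, size=n) > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Global interpretation: permutation importance over the whole cohort,
# one score per feature for the entire model.
global_imp = permutation_importance(
    model, X, y, n_repeats=5, random_state=0).importances_mean

def local_contributions(model, X, i):
    """Per-individual contribution of each feature, approximated by
    replacing that feature with its cohort mean and measuring the
    change in the predicted probability for individual i."""
    base = model.predict_proba(X[i:i + 1])[0, 1]
    contrib = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        x_ref = X[i:i + 1].copy()
        x_ref[0, j] = X[:, j].mean()
        contrib[j] = base - model.predict_proba(x_ref)[0, 1]
    return contrib

# Local interpretation: one explanation vector for a single individual.
local_imp = local_contributions(model, X, i=0)
```

A feature with a small global score can still receive a large local contribution for a particular individual, which is the situation the paper highlights for features like depression or smoking status in specific patient subgroups.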


Cited By

  • (2024) Hybrid visualization-based framework for depressive state detection and characterization of atypical patients. Journal of Biomedical Informatics, 147:C. DOI: 10.1016/j.jbi.2023.104535. Online publication date: 1 Feb 2024.
  • (2023) Explainable AI for Medical Event Prediction for Heart Failure Patients. Artificial Intelligence in Medicine, pp. 97–107. DOI: 10.1007/978-3-031-34344-5_12. Online publication date: 12 Jun 2023.
  • (2022) Explaining Artificial Intelligence with Tailored Interactive Visualisations. Companion Proceedings of the 27th International Conference on Intelligent User Interfaces, pp. 120–123. DOI: 10.1145/3490100.3516481. Online publication date: 22 Mar 2022.



Published In

Artificial Intelligence in Medicine: Knowledge Representation and Transparent and Explainable Systems: AIME 2019 International Workshops, KR4HC/ProHealth and TEAAM, Poznan, Poland, June 26–29, 2019, Revised Selected Papers
Jun 2019
183 pages
ISBN: 978-3-030-37445-7
DOI: 10.1007/978-3-030-37446-4

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Machine learning
  2. Interpretation
  3. Type 2 diabetes mellitus

Qualifiers

  • Article

