research-article

A predictive analytics framework for identifying patients at risk of developing multiple medical complications caused by chronic diseases

Published: 01 November 2019

Highlights

Patients with chronic diseases are often at risk for multiple correlated complications.
Single-task learning predicts these complications but ignores their correlations.
We use single- and multi-task learning with different predictive models.
We compare the performance of these models in predicting complications of hypertrophic cardiomyopathy.
We show that multi-task learning implemented with logistic regression performs best.

Abstract

Chronic diseases often cause several medical complications. This paper aims to predict multiple complications among patients with a chronic disease. The literature uses single-task learning algorithms to predict complications independently, which assumes no correlation among the complications of a chronic disease. We propose two methods, independent prediction of complications with single-task learning and concurrent prediction of complications with multi-task learning, and show that medical complications of chronic diseases can be correlated. In a case study, we compare the performance of the two methods by predicting complications of hypertrophic cardiomyopathy from 106 predictors in 1078 electronic medical records covering April 2009 through April 2017, inclusive. Both methods are implemented using logistic regression, artificial neural networks, decision trees, and support vector machines. The results show that multi-task learning with logistic regression improves prediction performance in terms of both discrimination and calibration.
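
The contrast between independent and concurrent prediction can be illustrated with a minimal sketch. The abstract does not give the paper's multi-task formulation, so the code below swaps in a simple stand-in: logistic regressions for two outcomes whose weight vectors are pulled toward their common mean by a coupling penalty. Setting the penalty to zero recovers independent single-task models. The data are synthetic, and every name (`make_data`, `fit`, `coupling`, the sample sizes) is a hypothetical choice for illustration. Discrimination is summarized with AUC, and the Brier score is used as a rough proxy for calibration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def make_data(n, d, seed=0):
    # Synthetic stand-in for patient records: two complications whose true
    # weight vectors share a common component, so the outcomes are correlated.
    rng = random.Random(seed)
    shared = [rng.gauss(0, 1) for _ in range(d)]
    true_w = [[s + 0.3 * rng.gauss(0, 1) for s in shared] for _ in range(2)]
    X, Y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(d)]
        X.append(x)
        Y.append([1 if rng.random() < sigmoid(dot(w, x)) else 0 for w in true_w])
    return X, Y


def fit(X, Y, coupling, epochs=150, lr=0.1):
    # Gradient descent on mean log-loss per task plus
    # coupling * sum_t ||w_t - mean(w)||^2 (assumed penalty, not the paper's).
    # coupling = 0 gives independent single-task logistic regressions.
    n, d, T = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * d for _ in range(T)]
    for _ in range(epochs):
        mean_w = [sum(W[t][j] for t in range(T)) / T for j in range(d)]
        grads = [[0.0] * d for _ in range(T)]
        for x, y in zip(X, Y):
            for t in range(T):
                err = sigmoid(dot(W[t], x)) - y[t]
                for j in range(d):
                    grads[t][j] += err * x[j] / n
        for t in range(T):
            for j in range(d):
                grads[t][j] += 2 * coupling * (W[t][j] - mean_w[j])
                W[t][j] -= lr * grads[t][j]
    return W


def auc(scores, labels):
    # Probability that a random positive outranks a random negative.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier(scores, labels):
    # Mean squared error between predicted probability and outcome.
    return sum((s - l) ** 2 for s, l in zip(scores, labels)) / len(scores)


X, Y = make_data(300, 5)
X_train, Y_train, X_test, Y_test = X[:200], Y[:200], X[200:], Y[200:]

W_single = fit(X_train, Y_train, coupling=0.0)  # independent prediction
W_multi = fit(X_train, Y_train, coupling=0.5)   # concurrent prediction

results = {}
for name, W in (("single-task", W_single), ("multi-task", W_multi)):
    for t in range(2):
        scores = [sigmoid(dot(W[t], x)) for x in X_test]
        labels = [y[t] for y in Y_test]
        results[(name, t)] = (auc(scores, labels), brier(scores, labels))
        print(name, "complication", t, "AUC=%.3f Brier=%.3f" % results[(name, t)])
```

Whether coupling helps on real records depends on how strongly the complications are related and on the sample size. This sketch only shows the mechanics of the comparison, not the paper's reported results.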

Published In

Artificial Intelligence in Medicine, Volume 101, Issue C
Nov 2019
131 pages

Publisher

Elsevier Science Publishers Ltd.
United Kingdom

Author Tags

1. Predictive analytics
2. Chronic disease
3. Artificial neural networks
4. Multi-task learning
5. Regression

Qualifiers

• Research-article
