Abstract
Chronic Kidney Disease (CKD) is a chronic disease with increasing incidence and prevalence worldwide and a high cost to health systems. Delayed recognition and prevention often lead to premature mortality due to the progressive and incurable loss of kidney function. Employing data mining classifiers to discover patterns in CKD indicators would contribute to an early diagnosis that allows patients to prevent such severe kidney damage. Adopting the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology, this work develops a classifier model that would support healthcare professionals in the early diagnosis of CKD patients. By building a data pipeline that manages the different phases of CRISP-DM, automated data transformation, modelling and evaluation are applied to the CKD dataset extracted from the UCI ML repository. Moreover, the pipeline, together with the Scikit-learn package’s GridSearchCV, is used to carry out an exhaustive search for the best data mining classifier and for the parameters of the data preparation sub-stages, such as missing-data handling and feature selection. AdaBoost is selected as the best classifier: it reaches 100% in accuracy, precision, sensitivity, specificity, F1-score and ROC AUC, outperforming the classification results reported in the related works reviewed. In addition, feature selection reduces the features employed in the developed classifier model to as few as 12 of the original 24.
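The kind of joint search described in the abstract can be sketched as follows. This is a minimal illustration, not the chapter's actual pipeline. It rests on several assumptions: a synthetic 24-feature dataset stands in for the UCI CKD data, missing values are injected artificially, the step names `impute`, `select` and `clf` are made up for the example, and the candidate classifiers and parameter values are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Assumption: synthetic stand-in for the CKD dataset (24 numeric features).
X, y = make_classification(
    n_samples=400, n_features=24, n_informative=8, random_state=0
)

# Assumption: knock out about 5% of the cells to mimic missing clinical values.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Data preparation and modelling chained in one pipeline, so every
# cross-validation fold re-fits imputation and feature selection on
# its own training split only.
pipe = Pipeline([
    ("impute", SimpleImputer()),
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", AdaBoostClassifier(random_state=0)),
])

# The grid spans preprocessing parameters and the classifier step itself.
param_grid = {
    "impute__strategy": ["mean", "median"],
    "select__k": [8, 12, 24],
    "clf": [
        AdaBoostClassifier(n_estimators=25, random_state=0),
        RandomForestClassifier(n_estimators=25, random_state=0),
    ],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Keeping imputation and feature selection inside the pipeline keeps information from the validation folds out of the preprocessing steps. Without that, the cross-validated scores would be optimistic.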
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Moreno-Sanchez, P.A. (2021). Chronic Kidney Disease Early Diagnosis Enhancing by Using Data Mining Classification and Features Selection. In: Goleva, R., Garcia, N.R.d.C., Pires, I.M. (eds) IoT Technologies for HealthCare. HealthyIoT 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 360. Springer, Cham. https://doi.org/10.1007/978-3-030-69963-5_5
DOI: https://doi.org/10.1007/978-3-030-69963-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69962-8
Online ISBN: 978-3-030-69963-5
eBook Packages: Computer Science (R0)