Abstract
Machine learning is an effective support system for health diagnosis, a domain that involves large volumes of data. Analyzing such volumes consumes considerable resources and execution time. In addition, not all features present in a dataset contribute to solving the given problem. Hence, an effective feature selection algorithm is needed to find the features that contribute most to diagnosing diseases. Particle Swarm Optimization (PSO) is a metaheuristic algorithm that can find a good solution in little time. PSO is now used both to select the most significant features and to remove the irrelevant and redundant features present in the dataset. However, the traditional PSO algorithm has difficulty selecting the optimal weight used to update the velocity and position of the particles. To overcome this issue, this paper presents a novel function for identifying optimal weights on the basis of a population diversity function and a tuning function. We also propose a novel fitness function for PSO with the help of a Support Vector Machine (SVM). The objective of the fitness function is to minimize the number of attributes and maximize the accuracy. The performance of the proposed PSO-SVM is compared with existing feature selection algorithms: Info gain, Chi-squared, One attribute based, Consistency subset, Relief, CFS, Filtered subset, Filtered attribute, Gain ratio, and the PSO algorithm. The SVM classifier is also compared with several other classifiers, namely Naive Bayes, Random forest, and MLP.
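As a rough illustration of the wrapper approach outlined in the abstract, the following minimal sketch runs a binary PSO over feature masks. It is not the paper's method: the abstract does not give the diversity-based weight function or the exact fitness formula, so the sketch substitutes a textbook linearly decreasing inertia weight and assumes a weighted sum of accuracy and feature reduction as the fitness. The names `fitness`, `binary_pso`, `toy_accuracy`, the parameter `alpha`, and the count of 13 features are all hypothetical, and `toy_accuracy` merely stands in for a cross-validated SVM accuracy.

```python
import math
import random


def fitness(mask, accuracy_fn, alpha=0.9):
    """Assumed fitness: weighted sum of accuracy and fraction of features dropped."""
    n_selected = sum(mask)
    if n_selected == 0:
        return 0.0  # an empty subset cannot be classified
    reduction = 1.0 - n_selected / len(mask)
    return alpha * accuracy_fn(mask) + (1.0 - alpha) * reduction


def binary_pso(n_features, accuracy_fn, n_particles=20, n_iters=40,
               w_max=0.9, w_min=0.4, c1=2.0, c2=2.0, v_max=4.0, seed=0):
    rng = random.Random(seed)
    # Each particle is a 0/1 mask over the features.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, accuracy_fn) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for t in range(n_iters):
        # Linearly decreasing inertia weight: a textbook stand-in,
        # NOT the diversity-based weight function proposed in the paper.
        w = w_max - (w_max - w_min) * t / max(1, n_iters - 1)
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # Sigmoid transfer: velocity gives the probability of keeping feature d.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], accuracy_fn)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def toy_accuracy(mask, informative=(0, 3, 7)):
    """Hypothetical stand-in for cross-validated SVM accuracy on the masked features."""
    hits = sum(mask[i] for i in informative)
    return 0.5 + 0.5 * hits / len(informative)


best_mask, best_fit = binary_pso(n_features=13, accuracy_fn=toy_accuracy)
```

In practice, `toy_accuracy` would be replaced by a function that trains and validates an SVM on the columns selected by `mask`, for example with scikit-learn's `SVC` and `cross_val_score`. The value of `alpha` controls the trade-off between accuracy and subset size and would need tuning.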
Additional information
The article is published in the original.
Cite this article
Vijayashree, J., Sultana, H.P. A Machine Learning Framework for Feature Selection in Heart Disease Classification Using Improved Particle Swarm Optimization with Support Vector Machine Classifier. Program Comput Soft 44, 388–397 (2018). https://doi.org/10.1134/S0361768818060129