Recently, the use of artificial intelligence based data mining techniques for massive medical data classification and diagnosis has gained its popularity, whereas the effectiveness and efficiency by feature selection is worthy to further investigate. In this paper, we presents a novel method for feature selection with the use of opposite sign test (OST) as a local search for the electromagnetism-like mechanism (EM) algorithm, denoted as improved electromagnetism-like mechanism (IEM) algorithm. Nearest neighbor algorithm is served as a classifier for the wrapper method. The proposed IEM algorithm is compared with nine popular feature selection and classification methods. Forty-six datasets from the UCI repository and eight gene expression microarray datasets are collected for comprehensive evaluation. Non-parametric statistical tests are conducted to justify the performance of the methods in terms of classification accuracy and Kappa index. The results confirm that the proposed IEM method is superior to the common state-of-art methods. Furthermore, we apply IEM to predict the occurrence of Type 2 diabetes mellitus (DM) after a gestational DM. Our research helps identify the risk factors for this disease; accordingly accurate diagnosis and prognosis can be achieved to reduce the morbidity and mortality rate caused by DM.
Keywords: Diabetes mellitus; Electromagnetism-like mechanism algorithm; Feature selection; Nearest-neighbor heuristic; Opposite sign test.
Copyright © 2015 Elsevier Inc. All rights reserved.