Abstract
In recent years the use of fuzzy clustering techniques in medical diagnosis has increased steadily, because these techniques are effective at recognizing patterns in medical databases and so help medical experts diagnose diseases. This study focuses on clustering a lung cancer dataset into three types of cancer, which are a leading cause of cancer death worldwide. The paper develops effective fuzzy clustering techniques that incorporate a hyper tangent kernel function and entropy methods for analyzing the lung cancer database, to assist physicians in diagnosing lung cancer. It further proposes an algorithm for initializing the cluster centers to speed up the clustering algorithms. The effectiveness of the proposed methods is demonstrated through experiments on a synthetic dataset, the Wine dataset, and the IRIS dataset in terms of running time, number of iterations, visual segmentation effects, and clustering accuracy. The proposed methods are then applied to the lung cancer database to divide it into three types of lung cancer. In addition, the paper supports the superiority of the proposed methods by comparing the obtained classes with reference classes through an error matrix.
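The error-matrix comparison mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `error_matrix` and `overall_accuracy`, the toy labels, and the assumption that cluster indices have already been matched to the reference class indices are all introduced here for illustration only.

```python
import numpy as np

def error_matrix(reference, obtained, n_classes):
    """Count samples per (reference class, obtained class) pair.

    Rows index the reference classes, columns the obtained classes.
    Assumes (hypothetically) that cluster labels were already aligned
    with the reference labels, so the diagonal holds the agreements.
    """
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for r, o in zip(reference, obtained):
        matrix[r, o] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the error matrix."""
    return np.trace(matrix) / matrix.sum()

# Toy labels for nine samples and three classes (made up for illustration).
reference = [0, 0, 0, 1, 1, 1, 2, 2, 2]
obtained = [0, 0, 1, 1, 1, 1, 2, 2, 0]

em = error_matrix(reference, obtained, n_classes=3)
print(em)
print(overall_accuracy(em))
```

Off-diagonal entries show which reference classes are confused with which obtained classes, which is the information an overall accuracy figure alone does not give.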
Acknowledgements
This work was financially supported by UGC MRP, India (Ref. No. 39-35/2010(SR)).
Cite this article
Ramathilagam, S., Devi, R. & Kannan, S.R. Extended fuzzy c-means: an analyzing data clustering problems. Cluster Comput 16, 389–406 (2013). https://doi.org/10.1007/s10586-012-0202-2
DOI: https://doi.org/10.1007/s10586-012-0202-2