Abstract
Clustering is a common means of identifying structure in complex data, and there is renewed interest in it as a tool for analyzing large data sets in many fields. A fundamental and difficult problem in cluster analysis is deciding how many clusters are appropriate for describing a given system. This paper develops a method for determining the number of clusters automatically. First, a new clustering validity evaluation function is proposed, based on the extended decision-theoretic rough set model. Then a hierarchical clustering algorithm is proposed, and some conclusions are drawn from its validation. Experimental results show that the new clustering method stops automatically at the perfect number of clusters, and they validate the laws governing how the clustering validity evaluation function changes.
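The general idea of pairing a hierarchical clustering procedure with a validity function can be illustrated with a minimal sketch. The abstract does not give the paper's evaluation function or its stopping rule, so the code below makes several assumptions: it substitutes the mean silhouette width for the decision-theoretic rough set function, uses average-linkage agglomerative merging, and scans every merge level for the best score rather than stopping early. The names `silhouette`, `average_linkage`, and `choose_k` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def silhouette(clusters):
    """Mean silhouette width of a partition.

    Stand-in validity index (an assumption): the paper's function is based
    on the extended decision-theoretic rough set model and is not shown here.
    """
    if len(clusters) < 2:
        return -1.0
    total, n = 0.0, 0
    for i, cluster in enumerate(clusters):
        for p in cluster:
            n += 1
            if len(cluster) == 1:
                continue  # a singleton contributes a silhouette of 0
            # Mean distance to the other members of the same cluster.
            a = sum(math.dist(p, q) for q in cluster) / (len(cluster) - 1)
            # Smallest mean distance to the members of any other cluster.
            b = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            m = max(a, b)
            if m > 0:
                total += (b - a) / m
    return total / n


def average_linkage(c1, c2):
    """Mean pairwise distance between two clusters."""
    return sum(math.dist(p, q) for p in c1 for q in c2) / (len(c1) * len(c2))


def choose_k(points):
    """Merge bottom-up and keep the partition with the best validity score."""
    clusters = [[p] for p in points]
    best_score = silhouette(clusters)
    best = [list(c) for c in clusters]
    while len(clusters) > 2:
        # Find the closest pair of clusters under average linkage.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        score = silhouette(clusters)
        if score > best_score:
            best_score, best = score, [list(c) for c in clusters]
    return best, best_score


# Toy data (made up for illustration): three well-separated groups.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
best, score = choose_k(points)
print(len(best), round(score, 3))
```

Scanning all merge levels keeps the sketch short. A method that stops automatically, as the abstract describes, would instead halt as soon as the validity function indicates that further merging no longer helps.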
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, H., Liu, Z., Wang, G. (2011). Automatically Determining the Number of Clusters Using Decision-Theoretic Rough Set. In: Yao, J., Ramanna, S., Wang, G., Suraj, Z. (eds) Rough Sets and Knowledge Technology. RSKT 2011. Lecture Notes in Computer Science, vol 6954. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24425-4_65
DOI: https://doi.org/10.1007/978-3-642-24425-4_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24424-7
Online ISBN: 978-3-642-24425-4
eBook Packages: Computer Science; Computer Science (R0)