Abstract
The capacity of a clustering model can be defined as the ability to represent complex spatial data distributions. We introduce a method to quantify the capacity of an approximate spectral clustering model based on the eigenspectrum of the similarity matrix, providing the ability to measure capacity in a direct way and to estimate the most suitable model parameters. The method is tested on simple datasets and applied to a forged banknote classification problem.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Buhmann, J., Tishby, N.: Empirical risk approximation a statistical learning theory of data clustering. NATO ASI Ser. Ser. F: Comput. Syst. Sci. 57–68 (1998)
Chung, F.R.K.: Spectral Graph Theory (CBMS Regional Conference Series in Mathematics, No. 92). American Mathematical Society, Providence (1997)
De Silva, V., Tenenbaum, J.B.: Sparse multidimensional scaling using landmark points. Technical report, Stanford University (2004)
Decelle, A., Krzakala, F., Moore, C., Zdeborová, L.: Inference and phase transitions in the detection of modules in sparse networks. Phys. Rev. Lett. 107(6), 065701 (2011)
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annu. Eugen. 7(Part II), 179–188 (1936)
Gersho, A., Gray, R.M.: Vector Quantization and Signal Compression. Kluwer, Boston (1992)
Gray, R.: Vector quantization. IEEE Acoust. Speech Signal Process. Mag. 1, 4–29 (1984)
Halkidi, M., Batistakis, Y., Vazirgiannis, M.: On clustering validation techniques. J. Intell. Inf. Syst. 17(2–3), 107–145 (2001)
Handl, J., Knowles, J., Kell, D.B.: Computational cluster validation in post-genomic data analysis. Bioinformatics 21(15), 3201 (2005)
Lee, M.D.: On the complexity of additive clustering models. J. Math. Psychol. 45(1), 131–148 (2001)
Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml
Rose, K., Gurewitz, E., Fox, G.: Statistical mechanics and phase transitions in clustering. Phys. Rev. Lett. 65, 945–948 (1990)
Rovetta, S.: Model complexity control in clustering. In: Bassis, S., Esposito, A., Morabito, F.C., Pasero, E. (eds.) Advances in Neural Networks. SIST, vol. 54, pp. 111–120. Springer, Cham (2016). doi:10.1007/978-3-319-33747-0_11
Torgerson, W.S.: Theory and Methods of Scaling. Wiley, New York (1958)
Vapnik, V.N.: Statistical Learning Theory. Wiley, New York (1998)
Wigner, E.P.: Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 62, 548–564 (1955)
Yan, D., Huang, L., Jordan, M.I.: Fast approximate spectral clustering. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 907–916. ACM (2009)
Zhou, B., Trinajstić, N.: On the largest eigenvalue of the distance matrix of a connected graph. Chem. Phys. Lett. 447(4), 384–387 (2007)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Rovetta, S., Masulli, F., Cabri, A. (2017). Measuring Clustering Model Complexity. In: Lintas, A., Rovetta, S., Verschure, P., Villa, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2017. ICANN 2017. Lecture Notes in Computer Science(), vol 10614. Springer, Cham. https://doi.org/10.1007/978-3-319-68612-7_49
Download citation
DOI: https://doi.org/10.1007/978-3-319-68612-7_49
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68611-0
Online ISBN: 978-3-319-68612-7
eBook Packages: Computer ScienceComputer Science (R0)