Abstract
In this paper, we focus on the class of graph-based clustering models, such as growing neural gas or idiotypic nets, for the purpose of clustering high-dimensional text data. We present a novel approach that does not require operating on the complex overall graph of clusters; instead, it shifts the majority of the effort to context-sensitive processing of local subgraphs and local sub-spaces. Savings of orders of magnitude in processing time and memory can be achieved while the quality of the clusters improves, as the presented experiments demonstrate.
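The abstract does not spell out the algorithm, so the following is only a minimal sketch of what local sub-space processing could look like, not the authors' method. It assumes documents are already tokenized and already assigned to contexts; the `context_of` labels, the function names, and the toy documents are all hypothetical. Each context is vectorized over its own vocabulary, so its vectors are shorter than they would be over the global vocabulary.

```python
from collections import Counter


def build_vocabulary(docs):
    """Return the sorted list of distinct terms in a list of tokenized documents."""
    return sorted({term for doc in docs for term in doc})


def to_vector(doc, vocabulary):
    """Term-frequency vector of one document over the given vocabulary."""
    counts = Counter(doc)
    return [counts[term] for term in vocabulary]


def local_subspaces(docs, context_of):
    """Group documents by context and vectorize each group over its own vocabulary.

    context_of is a hypothetical, precomputed list of context labels,
    one per document; how contexts are found is outside this sketch.
    """
    groups = {}
    for doc, label in zip(docs, context_of):
        groups.setdefault(label, []).append(doc)

    result = {}
    for label, members in groups.items():
        vocab = build_vocabulary(members)
        result[label] = (vocab, [to_vector(d, vocab) for d in members])
    return result


# Toy data (assumed for illustration only).
docs = [
    ["neural", "gas", "graph"],
    ["graph", "cluster", "neural"],
    ["immune", "network", "antibody"],
    ["antibody", "immune", "cell"],
]
context_of = ["nets", "nets", "immune", "immune"]

global_vocab = build_vocabulary(docs)
spaces = local_subspaces(docs, context_of)

for label, (vocab, vectors) in spaces.items():
    print(label, "local dims:", len(vocab), "global dims:", len(global_vocab))
```

In this toy setting each context works in a 4-dimensional space rather than the 8-dimensional global one. A graph-based model such as growing neural gas could then be trained per context on the shorter vectors, which is the kind of saving the abstract alludes to.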
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ciesielski, K., Kłopotek, M.A. (2006). Text Data Clustering by Contextual Graphs. In: Todorovski, L., Lavrač, N., Jantke, K.P. (eds) Discovery Science. DS 2006. Lecture Notes in Computer Science, vol 4265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893318_10
DOI: https://doi.org/10.1007/11893318_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46491-4
Online ISBN: 978-3-540-46493-8
eBook Packages: Computer Science (R0)