Abstract
In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for semi-automatic topic ontology construction. The OntoGen system supports the user during the construction process by suggesting topics and analyzing them in real time. It suggests names for the topics in two alternative ways, both based on extracting keywords from the set of documents inside the topic. The first set, of descriptive keywords, is extracted using document centroid vectors, while the second set, of distinctive keywords, is extracted from the SVM classification model that separates documents in the topic from the neighboring documents.
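The topic suggestion and descriptive keyword steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the OntoGen implementation: it assumes TF-IDF document vectors, cosine similarity, and a k-means loop seeded with the first k documents, and it leaves out Latent Semantic Indexing and the SVM-based distinctive keywords. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map tokenized documents to L2-normalized TF-IDF vectors (dicts)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        vec = {t: c * math.log(n / df[t]) for t, c in Counter(doc).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def centroid(vectors):
    """Average a non-empty list of sparse vectors term by term."""
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {t: w / len(vectors) for t, w in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def kmeans(vectors, k, iterations=10):
    """Plain k-means on cosine similarity.

    Assumption: centroids are seeded with the first k documents so the
    sketch stays deterministic; a real system would seed differently.
    """
    centroids = [dict(v) for v in vectors[:k]]
    assignment = []
    for _ in range(iterations):
        assignment = [
            max(range(k), key=lambda j: cosine(v, centroids[j]))
            for v in vectors
        ]
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = centroid(members)
    return assignment, centroids


def descriptive_keywords(topic_centroid, top_n=3):
    """Return the top_n highest-weighted terms of a topic centroid."""
    ranked = sorted(topic_centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


# Toy corpus (hypothetical): two interleaved topics.
docs = [
    "svm kernel margin".split(),
    "ontology concept taxonomy".split(),
    "svm kernel classifier".split(),
    "ontology concept relation".split(),
    "svm margin training".split(),
    "ontology taxonomy editor".split(),
]
vectors = tfidf_vectors(docs)
assignment, centroids = kmeans(vectors, k=2)
for j, c in enumerate(centroids):
    print(j, descriptive_keywords(c))
```

Each suggested topic is a cluster of documents, and its candidate name comes from the heaviest terms in the cluster centroid. Distinctive keywords would instead require training a linear classifier between the topic and its neighbors and reading off the highest-weighted features, which this sketch does not attempt.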
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fortuna, B., Mladenič, D., Grobelnik, M. (2006). Semi-automatic Construction of Topic Ontologies. In: Ackermann, M., et al. Semantics, Web and Mining. EWMF 2005, KDO 2005. Lecture Notes in Computer Science, vol 4289. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11908678_8
DOI: https://doi.org/10.1007/11908678_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-47697-9
Online ISBN: 978-3-540-47698-6
eBook Packages: Computer Science, Computer Science (R0)