Research article. DOI: 10.1145/1456223.1456229

Document classification system based on HMM word map

Published: 28 October 2008

Abstract

This article presents a system based on Hidden Markov Models (HMMs) for document organization. The system classifies a document collection according to document content. It has a two-level hybrid connectionist architecture comprising (i) a word map created automatically using an HMM, which serves as the feature extraction module, and (ii) a supervised classifier based on a multi-layer perceptron (MLP), which produces the final classification result. A series of experiments on Modern Greek text-only documents is presented to illustrate the effectiveness of the proposed system.
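The two-level architecture described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes one plausible reading in which HMM states act as word clusters, each document is decoded with the Viterbi algorithm, and the normalized histogram of visited states is the feature vector passed to the MLP. The state names, probabilities, and MLP weights below are all hypothetical values chosen for illustration; in a real system they would be learned from data.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p, floor=1e-6):
    """Most likely hidden-state path for a non-empty word sequence (log space)."""
    def log_emit(s, w):
        # Words a state never emits get a small floor probability.
        return math.log(emit_p[s].get(w, floor))

    score = [{s: math.log(start_p[s]) + log_emit(s, obs[0]) for s in states}]
    back = []
    for w in obs[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            cur[s] = prev[best] + math.log(trans_p[best][s]) + log_emit(s, w)
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

def state_histogram(path, states):
    """Level 1 output: fraction of the document's words assigned to each state."""
    return [path.count(s) / len(path) for s in states]

def mlp_forward(x, w_hidden, w_out):
    """Level 2: one tanh hidden layer, then softmax (biases omitted for brevity)."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    z = [sum(w * v for w, v in zip(row, h)) for row in w_out]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

# Hypothetical toy HMM with two states standing in for word clusters.
states = ["sport", "econ"]
start_p = {"sport": 0.5, "econ": 0.5}
trans_p = {"sport": {"sport": 0.7, "econ": 0.3},
           "econ": {"sport": 0.3, "econ": 0.7}}
emit_p = {"sport": {"goal": 0.5, "team": 0.4, "bank": 0.1},
          "econ": {"bank": 0.5, "market": 0.4, "team": 0.1}}

# Hypothetical MLP weights; a trained classifier would learn these.
w_hidden = [[2.0, -2.0], [-2.0, 2.0]]
w_out = [[1.0, -1.0], [-1.0, 1.0]]

document = ["team", "goal", "goal", "bank"]
path = viterbi(document, states, start_p, trans_p, emit_p)
features = state_histogram(path, states)
probs = mlp_forward(features, w_hidden, w_out)
print(path, features, probs)
```

The split mirrors the abstract: the unsupervised word-level model reduces a variable-length document to a fixed-length vector, so the supervised classifier only ever sees inputs of one size regardless of document length.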


Published In

CSTST '08: Proceedings of the 5th international conference on Soft computing as transdisciplinary science and technology
October 2008
733 pages
ISBN:9781605580463
DOI:10.1145/1456223

Sponsors

  • The French Chapter of ACM Special Interest Group on Applied Computing
  • Ministère des Affaires Etrangères et Européennes
  • Région Ile de France
  • Communauté d'Agglomération de Cergy-Pontoise
  • Institute of Electrical and Electronics Engineers Systems, Man and Cybernetics Society
  • The European Society for Fuzzy Logic and Technology
  • Institute of Electrical and Electronics Engineers France Section
  • Laboratoire des Equipes Traitement des Images et du Signal
  • AFIHM: Ass. Francophone d'Interaction Homme-Machine
  • The International Fuzzy System Association
  • Laboratoire Innovation Développement
  • University of Cergy-Pontoise
  • The World Federation of Soft Computing
  • Agence de Développement Economique de Cergy-Pontoise
  • The European Neural Network Society
  • Comité d'Expansion Economique du Val d'Oise

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. hidden Markov models
  2. multi-layer perceptron
  3. text classification
  4. word clustering


Cited By

  • (2022) Improving Short Query Representation in LDA Based Information Retrieval Systems. Hybrid Artificial Intelligent Systems, pp. 111-122. DOI: 10.1007/978-3-031-15471-3_10. Online publication date: 12-Sep-2022
  • (2020) LDA filter: A Latent Dirichlet Allocation preprocess method for Weka. PLOS ONE, 15:11 (e0241701). DOI: 10.1371/journal.pone.0241701. Online publication date: 9-Nov-2020
  • (2020) An HMM-based synthetic view generator to improve the efficiency of ensemble systems. Logic Journal of the IGPL. DOI: 10.1093/jigpal/jzz067. Online publication date: 13-Jan-2020
  • (2020) Using hidden Markov model to predict recurrence of breast cancer based on sequential patterns in gene expression profiles. Journal of Biomedical Informatics, 111 (103570). DOI: 10.1016/j.jbi.2020.103570. Online publication date: Nov-2020
  • (2016) An HMM-Based Multi-view Co-training Framework for Single-View Text Corpora. Hybrid Artificial Intelligent Systems, pp. 66-78. DOI: 10.1007/978-3-319-32034-2_6. Online publication date: 14-Apr-2016
  • (2015) TCBR-HMM. Applied Soft Computing, 26:C (463-473). DOI: 10.1016/j.asoc.2014.10.019. Online publication date: 1-Jan-2015
  • (2014) T-HMM: A Novel Biomedical Text Classifier Based on Hidden Markov Models. 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014), pp. 225-234. DOI: 10.1007/978-3-319-07581-5_27. Online publication date: 2014
