DOI: 10.3115/1072228.1072383

Multi-dimensional text classification

Published: 24 August 2002

Abstract

This paper proposes a multi-dimensional framework for classifying text documents. The framework introduces the concept of a multi-dimensional category model for representing classes. In contrast with traditional flat and hierarchical category models, the multi-dimensional category model classifies each text document in a collection using multiple predefined sets of categories, where each set corresponds to a dimension. Since a multi-dimensional model can be converted to flat and hierarchical models, three classification strategies are possible: classifying directly based on the multi-dimensional model, or classifying with the equivalent flat or hierarchical model. The efficiency of these three classification strategies is investigated on two data sets. Using k-NN, naïve Bayes, and centroid-based classifiers, the experimental results show that multi-dimensional-based and hierarchical-based classification perform better than flat-based classification.
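The relationship between the three category models can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the conversion is performed, so the sketch assumes that the equivalent flat model is the Cartesian product of the dimensions and that the equivalent hierarchy uses one tree level per dimension. The dimension names, the categories, and the keyword-based toy classifiers are hypothetical stand-ins for the k-NN, naïve Bayes, or centroid-based classifiers used in the paper.

```python
from itertools import product

# Hypothetical dimensions and categories, chosen only for illustration.
DIMENSIONS = {
    "topic": ["sports", "politics"],
    "genre": ["news", "editorial"],
}


def to_flat(dimensions):
    """Assumed flat model: one class per combination of categories."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


def to_hierarchy(dimensions):
    """Assumed hierarchical model: one tree level per dimension, in dict order."""
    names = list(dimensions)

    def build(level):
        if level == len(names):
            return {}  # leaf: no further dimension to split on
        return {cat: build(level + 1) for cat in dimensions[names[level]]}

    return build(0)


def classify_multidimensional(doc, classifiers):
    """Direct strategy: one independent decision per dimension."""
    return {dim: clf(doc) for dim, clf in classifiers.items()}


# Toy keyword rules standing in for real per-dimension classifiers.
toy_classifiers = {
    "topic": lambda doc: "sports" if "match" in doc else "politics",
    "genre": lambda doc: "editorial" if "opinion" in doc else "news",
}

flat_classes = to_flat(DIMENSIONS)
hierarchy = to_hierarchy(DIMENSIONS)
label = classify_multidimensional("opinion on the match", toy_classifiers)
```

Under these assumptions, two dimensions with two categories each give four flat classes, whereas the direct strategy needs only two classifiers that each choose between two categories.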

Cited By

  • (2018) Multi-dimensional classification via a metric approach. Neurocomputing 275:C (1121-1131). DOI: 10.1016/j.neucom.2017.09.057. Online publication date: 31-Jan-2018
  • (2003) A study on feature weighting in Chinese text categorization. Proceedings of the 4th international conference on Computational linguistics and intelligent text processing (592-601). DOI: 10.5555/1791562.1791642. Online publication date: 16-Feb-2003
  • (2003) Chinese text categorization based on the binary weighting model with non-binary smoothing. Proceedings of the 25th European conference on IR research (408-419). DOI: 10.5555/1757788.1757827. Online publication date: 14-Apr-2003

Published In

COLING '02: Proceedings of the 19th international conference on Computational linguistics - Volume 1
August 2002
1184 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 1,537 of 1,537 submissions, 100%

Article Metrics

  • Downloads (last 12 months): 49
  • Downloads (last 6 weeks): 3
Reflects downloads up to 11 Dec 2024.
