research-article

Automatic document classification based on latent semantic analysis

Published: 15 March 2023

Abstract

This paper considers the problem of automatically classifying documents into a given set of topics. The proposed method uses latent semantic analysis to recover semantic dependencies between words, and documents are classified on the basis of these dependencies. Experiments on the standard TREC (Text REtrieval Conference) test collection support the attractiveness of this approach. Because the method has relatively low computational complexity at the classification stage, it can be applied to the classification of document streams.
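
The abstract does not give the algorithm in detail, so the following is only a minimal sketch of how classification by latent semantic analysis is commonly set up, not the authors' implementation. It assumes raw term counts, a truncated SVD of the term-document matrix, topics represented by the centroid of their training documents in the latent space, and cosine similarity as the proximity measure. The names `train`, `classify`, and `TRAINING`, the toy documents, and the choice `k=2` are all hypothetical.

```python
import numpy as np

# Hypothetical toy training set: (document text, topic label).
TRAINING = [
    ("stock market prices fall", "finance"),
    ("investors buy stock shares", "finance"),
    ("market shares rise", "finance"),
    ("team wins football match", "sports"),
    ("football players score goal", "sports"),
    ("team score match goal", "sports"),
]

def tokenize(text):
    return text.lower().split()

def term_document_matrix(docs, vocab):
    """Rows are terms, columns are documents, entries are raw counts."""
    row = {term: i for i, term in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in tokenize(doc):
            if tok in row:  # words outside the training vocabulary are ignored
                A[row[tok], j] += 1.0
    return A

def train(labeled_docs, k):
    """Fit a rank-k latent space and one centroid per topic (assumed design)."""
    docs = [text for text, _ in labeled_docs]
    labels = [label for _, label in labeled_docs]
    vocab = sorted({tok for d in docs for tok in tokenize(d)})
    A = term_document_matrix(docs, vocab)
    # The expensive step (SVD) happens once, at training time.
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]                # top-k left singular vectors (term space)
    latent = U_k.T @ A            # k x n_docs coordinates of training docs
    centroids = {}
    for lab in sorted(set(labels)):
        cols = [i for i, l in enumerate(labels) if l == lab]
        centroids[lab] = latent[:, cols].mean(axis=1)
    return vocab, U_k, centroids

def classify(text, vocab, U_k, centroids):
    """Project a new document and return the topic with the highest cosine."""
    q = U_k.T @ term_document_matrix([text], vocab)[:, 0]
    q_norm = np.linalg.norm(q)
    if q_norm == 0.0:
        return None               # no known words: nothing to compare
    def cosine(c):
        return float(q @ c) / (q_norm * np.linalg.norm(c))
    return max(centroids, key=lambda lab: cosine(centroids[lab]))

if __name__ == "__main__":
    model = train(TRAINING, k=2)
    print(classify("investors watch prices", *model))
    print(classify("players win the match", *model))
```

In this sketch, classifying a new document costs one matrix-vector product with the k-column matrix plus one cosine per topic, while the SVD is computed only during training. That split is one way to read the abstract's remark about low complexity at the classification stage.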

Published In

Programming and Computing Software, Volume 26, Issue 4
Jul 2000
60 pages

Publisher

Plenum Press

United States

Author Tags

  1. Latent Semantic Analysis
  2. Latent Semantic Indexing
  3. Hypothesis Space
  4. Topic Description
  5. Semantic Proximity

Qualifiers

  • Research-article
