DOI: 10.1145/564376.564395

Bayesian online classifiers for text classification and filtering

Published: 11 August 2002

Abstract

This paper explores the use of Bayesian online classifiers to classify text documents. Empirical results indicate that these classifiers are comparable with the best text classification systems. Furthermore, the online approach offers the advantage of continuous learning in the batch-adaptive text filtering task.

      Published In

      SIGIR '02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
      August 2002
      478 pages
      ISBN: 1581135610
      DOI: 10.1145/564376
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. bayesian
      2. machine learning
      3. online
      4. text classification
      5. text filtering

      Acceptance Rates

      SIGIR '02 Paper Acceptance Rate 44 of 219 submissions, 20%;
      Overall Acceptance Rate 792 of 3,983 submissions, 20%

      Article Metrics

      • Downloads (last 12 months): 9
      • Downloads (last 6 weeks): 1
      Reflects downloads up to 16 Jan 2025

      Cited By

      • (2023) Ensemble Active Learning by Contextual Bandits for AI Incubation in Manufacturing. ACM Transactions on Intelligent Systems and Technology 15:1 (1-26). DOI: 10.1145/3627821. Online publication date: 19-Dec-2023
      • (2022) MACHINE LEARNING FOR TEXT CLASSIFICATION IN BUILDING MANAGEMENT SYSTEMS. JOURNAL OF CIVIL ENGINEERING AND MANAGEMENT 28:5 (408-421). DOI: 10.3846/jcem.2022.16012. Online publication date: 12-May-2022
      • (2022) Identification of informational and probabilistic independence by adaptive thresholding. Intelligent Data Analysis 26:5 (1139-1160). DOI: 10.3233/IDA-215942. Online publication date: 1-Jan-2022
      • (2022) SARS-CoV-2 Prediction of Outbreak and Analysis Using Machine Learning. Proceedings of Third International Conference on Intelligent Computing, Information and Control Systems (899-912). DOI: 10.1007/978-981-16-7330-6_66. Online publication date: 15-Mar-2022
      • (2018) Assessing cyber-incidents using machine learning. International Journal of Information and Computer Security 10:4 (341-360). DOI: 10.5555/3292793.3292795. Online publication date: 1-Jan-2018
      • (2018) Proposing Enhanced Feature Engineering and a Selection Model for Machine Learning Processes. Applied Sciences 8:4 (646). DOI: 10.3390/app8040646. Online publication date: 20-Apr-2018
      • (2018) Binoclt: A New Binomial Classification Scheme for Long-Text Mining in Online Social Network. 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC) (1-8). DOI: 10.1109/PCCC.2018.8710994. Online publication date: Nov-2018
      • (2018) Segregation of Similar and Dissimilar Live RSS News Feeds Based on Similarity Measures. Data Management, Analytics and Innovation (333-344). DOI: 10.1007/978-981-13-1274-8_26. Online publication date: 8-Sep-2018
      • (2018) Knowledge-Based Metrics for Document Classification: Online Reviews Experiments. Intelligent Distributed Computing XII (425-435). DOI: 10.1007/978-3-319-99626-4_37. Online publication date: 15-Sep-2018
      • (2017) Multi-label text classification based on the label correlation mixture model. Intelligent Data Analysis 21:6 (1371-1392). DOI: 10.3233/IDA-163055. Online publication date: 15-Nov-2017
