DOI: 10.5555/1867406.1867498

Active learning with committees for text categorization

Published: 27 July 1997

Abstract

In many real-world domains, supervised learning requires a large number of training examples. In this paper, we describe an active learning method that uses a committee of learners to reduce the number of training examples required for learning. Our approach is similar to the Query by Committee framework, where disagreement among the committee members on the predicted label for the input part of the example is used to signal the need for knowing the actual value of the label. Our experiments are conducted in the text categorization domain, which is characterized by a large number of features, many of which are irrelevant. We report here on experiments using a committee of Winnow-based learners and demonstrate that this approach can reduce the number of labeled training examples required over that used by a single Winnow learner by 1-2 orders of magnitude.
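
The committee idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the textbook Winnow variant, the fixed threshold, the promotion factor `alpha`, the random initial weights used to make members differ, and the `n_seed` warm-up labels are all assumptions made for illustration, as are the names `Winnow`, `committee_active_learning`, and `majority_vote`.

```python
import random


class Winnow:
    """Textbook Winnow over binary features; x is the set of active feature indices."""

    def __init__(self, n_features, alpha=2.0, rng=None):
        rng = rng or random.Random()
        # Assumption: random positive initial weights make the members differ.
        self.w = [rng.uniform(0.5, 1.5) for _ in range(n_features)]
        self.alpha = alpha              # promotion/demotion factor
        self.theta = n_features / 2.0   # fixed decision threshold

    def predict(self, x):
        return 1 if sum(self.w[i] for i in x) >= self.theta else 0

    def update(self, x, y):
        """Mistake-driven update: weights change only when the prediction is wrong."""
        if self.predict(x) == y:
            return
        factor = self.alpha if y == 1 else 1.0 / self.alpha
        for i in x:
            self.w[i] *= factor


def committee_active_learning(stream, oracle, n_features, committee_size=3,
                              n_seed=5, seed=0):
    """Ask the oracle for a label only when committee members disagree."""
    rng = random.Random(seed)
    committee = [Winnow(n_features, rng=rng) for _ in range(committee_size)]
    n_queries = 0
    for t, x in enumerate(stream):
        votes = {m.predict(x) for m in committee}
        # Assumption: the first n_seed examples are always labeled to get started.
        if t < n_seed or len(votes) > 1:
            y = oracle(x)
            n_queries += 1
            for m in committee:
                m.update(x, y)
    return committee, n_queries


def majority_vote(committee, x):
    """Final prediction: label 1 if more than half of the members vote 1."""
    votes = sum(m.predict(x) for m in committee)
    return 1 if 2 * votes > len(committee) else 0


if __name__ == "__main__":
    rng = random.Random(1)
    n = 50
    # Toy "documents": each is a set of 8 active features out of 50.
    docs = [frozenset(rng.sample(range(n), 8)) for _ in range(500)]
    oracle = lambda x: 1 if 0 in x else 0   # toy target: feature 0 decides the class
    committee, n_queries = committee_active_learning(docs, oracle, n)
    print(f"labels requested: {n_queries} of {len(docs)}")
```

Examples on which all members agree are passed over without a label request, which is where any saving in labeled examples would come from. The toy data here says nothing about the reductions reported in the abstract.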



    Published In

    AAAI'97/IAAI'97: Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
    July 1997
    1085 pages
    ISBN:0262510952

    Publisher

    AAAI Press
