Abstract
We present a new generalization bound in which the use of unlabeled examples yields a better ratio between training-set size and the resulting classifier's quality, thus reducing the number of labeled examples needed to achieve a given quality. This is achieved by requiring the algorithms that generate the classifiers to agree on the unlabeled examples. The extent of the improvement depends on the diversity of the learners: a more diverse group of learners yields a larger improvement, whereas two copies of a single algorithm give no advantage at all. As a proof of concept, we apply the resulting algorithm, named AgreementBoost, to a web classification problem, obtaining a reduction of up to 40% in the number of labeled examples.
A full version of this paper is available at http://www.illc.uva.nl/Publications/ResearchReports/MoL-2005-02.text.pdf
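For intuition, the following Python sketch illustrates the agreement idea with two boosted stump learners. It is not the paper's AgreementBoost, which derives its updates by gradient descent on a cost functional containing a disagreement (variance) term over the unlabeled sample; here each learner's round simply also fits unlabeled examples pseudo-labeled by the other learner, so the two ensembles are pushed to agree. All names (fit_stump, agreement_boost_sketch, lam) and the weighting scheme are illustrative assumptions, not the paper's method.

# Illustrative sketch only -- NOT the paper's AgreementBoost. We merely
# mimic the agreement idea: each learner also fits unlabeled points
# pseudo-labeled by the other learner, with weights that grow where the
# two ensembles currently disagree.
import numpy as np

def fit_stump(X, y, w):
    # Weighted decision stump: pick (feature, threshold, sign) with least weighted error.
    best_err, best = np.inf, None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (1.0, -1.0):
                pred = s * np.sign(X[:, j] - thr + 1e-12)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thr, s)
    return best, best_err

def predict_stump(stump, X):
    j, thr, s = stump
    return s * np.sign(X[:, j] - thr + 1e-12)

def agreement_boost_sketch(X_lab, y, X_unl, rounds=10, lam=0.5):
    # Two boosted ensembles; `lam` (an assumption) trades labeled loss vs. agreement.
    ensembles = [[], []]
    F = [np.zeros(len(X_lab)), np.zeros(len(X_lab))]  # margins on labeled data
    G = [np.zeros(len(X_unl)), np.zeros(len(X_unl))]  # margins on unlabeled data
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(rounds):
        for i in (0, 1):
            other = 1 - i
            w_lab = np.exp(-y * F[i])                    # AdaBoost-style labeled weights
            pseudo = np.where(G[other] >= 0, 1.0, -1.0)  # other learner's current vote
            w_unl = lam * np.exp(-pseudo * G[i])         # large where the two disagree
            y_all = np.concatenate([y, pseudo])
            w_all = np.concatenate([w_lab, w_unl])
            w_all /= w_all.sum()
            stump, err = fit_stump(X_all, y_all, w_all)
            err = np.clip(err, 1e-10, 1.0 - 1e-10)
            alpha = 0.5 * np.log((1.0 - err) / err)
            ensembles[i].append((stump, alpha))
            F[i] += alpha * predict_stump(stump, X_lab)
            G[i] += alpha * predict_stump(stump, X_unl)
    return ensembles

# Tiny synthetic demo: few labeled points, many unlabeled ones.
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(30, 2))
y = np.where(X_lab[:, 0] > 0.0, 1.0, -1.0)
X_unl = rng.normal(size=(300, 2))
ensembles = agreement_boost_sketch(X_lab, y, X_unl, rounds=5)

Note that with two copies of the same weak learner on the same data the disagreement weights vanish, mirroring the abstract's observation that identical learners gain nothing from the agreement constraint.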
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leskes, B. (2005). The Value of Agreement, a New Boosting Algorithm. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science, vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_7
DOI: https://doi.org/10.1007/11503415_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26556-6
Online ISBN: 978-3-540-31892-7