Abstract
This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent.
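For readers who want the two notions side by side, here is a compact restatement of the standard PAC definitions in LaTeX; the symbols ε (target error), δ (confidence), and γ (the weak learner's advantage over random guessing) are the usual parameters and do not appear in the abstract itself.

\[
  \Pr\bigl[\operatorname{err}_D(h) \le \varepsilon\bigr] \;\ge\; 1 - \delta
  \quad\text{for every } \varepsilon, \delta \in (0,1)
  \qquad\text{(strong learnability)}
\]
\[
  \Pr\bigl[\operatorname{err}_D(h) \le \tfrac{1}{2} - \gamma\bigr] \;\ge\; 1 - \delta
  \quad\text{for some } \gamma > 0
  \qquad\text{(weak learnability)}
\]

Here \(\operatorname{err}_D(h)\) is the probability that the hypothesis h disagrees with the unknown concept on an instance drawn from the target distribution D; the equivalence result says that any learner meeting the weak guarantee can be converted into one meeting the strong guarantee.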
A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
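The construction itself is not spelled out in the abstract; the following Python sketch conveys only the general shape of one boosting stage, a majority vote over three weak hypotheses trained on filtered versions of the target distribution. The interface names (weak_learn, draw_example, target, m) are hypothetical, and the sketch omits the error estimation, the handling of the case where the first two hypotheses almost always agree, and the recursive application that the full algorithm requires.

import random

def boost_once(weak_learn, draw_example, target, m=200):
    """One boosting stage: a majority vote over three weak hypotheses.

    Hypothetical interface: weak_learn(examples) fits a list of (x, label)
    pairs and returns a 0/1-valued hypothesis h(x); draw_example() samples
    an instance x from the target distribution D; target(x) gives the
    unknown concept's label; m is an arbitrary per-hypothesis sample size.
    """

    def sample(draw):
        # Collect m labeled examples from the given (possibly filtered) source.
        return [(x, target(x)) for x in (draw() for _ in range(m))]

    # h1: weak hypothesis trained on the original distribution D.
    h1 = weak_learn(sample(draw_example))

    # D2: an even mix of instances that h1 classifies correctly and
    # incorrectly, obtained by rejection sampling (filtering) from D.
    def draw_d2():
        want_correct = random.random() < 0.5
        while True:
            x = draw_example()
            if (h1(x) == target(x)) == want_correct:
                return x
    h2 = weak_learn(sample(draw_d2))

    # D3: instances on which h1 and h2 disagree.  The full construction
    # first checks whether h1 and h2 agree so often that h1 alone is
    # already accurate enough; this sketch simply falls back to D after
    # a bounded number of rejections.
    def draw_d3():
        for _ in range(10_000):
            x = draw_example()
            if h1(x) != h2(x):
                return x
        return draw_example()
    h3 = weak_learn(sample(draw_d3))

    # Final hypothesis: majority vote of the three weak hypotheses
    # (labels assumed to be 0 or 1).
    return lambda x: 1 if h1(x) + h2(x) + h3(x) >= 2 else 0

Applied recursively, a stage of this form drives a weak error of β down toward 3β² − 2β³, which is the kind of improvement the paper's analysis quantifies.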
Cite this article
Schapire, R.E. The strength of weak learnability. Mach Learn 5, 197–227 (1990). https://doi.org/10.1007/BF00116037