Abstract
This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent.
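For readers who want the two notions side by side, here is a compact restatement of the standard PAC definitions in LaTeX; the symbols ε (target error), δ (confidence), and γ (the weak learner's advantage over random guessing) are the usual parameters and do not appear in the abstract itself.

\[
  \Pr\bigl[\operatorname{err}_D(h) \le \varepsilon\bigr] \;\ge\; 1 - \delta
  \quad\text{for every } \varepsilon, \delta \in (0,1)
  \qquad\text{(strong learnability)}
\]
\[
  \Pr\bigl[\operatorname{err}_D(h) \le \tfrac{1}{2} - \gamma\bigr] \;\ge\; 1 - \delta
  \quad\text{for some } \gamma > 0
  \qquad\text{(weak learnability)}
\]

Here \(\operatorname{err}_D(h)\) is the probability that the hypothesis h disagrees with the unknown concept on an instance drawn from the target distribution D; the equivalence result says that any learner meeting the weak guarantee can be converted into one meeting the strong guarantee.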
A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
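The construction itself is not spelled out in the abstract; the following Python sketch conveys only the general shape of one boosting stage, a majority vote over three weak hypotheses trained on filtered versions of the target distribution. The interface names (weak_learn, draw_example, target, m) are hypothetical, and the sketch omits the error estimation, the handling of the case where the first two hypotheses almost always agree, and the recursive application that the full algorithm requires.

import random

def boost_once(weak_learn, draw_example, target, m=200):
    """One boosting stage: a majority vote over three weak hypotheses.

    Hypothetical interface: weak_learn(examples) fits a list of (x, label)
    pairs and returns a 0/1-valued hypothesis h(x); draw_example() samples
    an instance x from the target distribution D; target(x) gives the
    unknown concept's label; m is an arbitrary per-hypothesis sample size.
    """

    def sample(draw):
        # Collect m labeled examples from the given (possibly filtered) source.
        return [(x, target(x)) for x in (draw() for _ in range(m))]

    # h1: weak hypothesis trained on the original distribution D.
    h1 = weak_learn(sample(draw_example))

    # D2: an even mix of instances that h1 classifies correctly and
    # incorrectly, obtained by rejection sampling (filtering) from D.
    def draw_d2():
        want_correct = random.random() < 0.5
        while True:
            x = draw_example()
            if (h1(x) == target(x)) == want_correct:
                return x
    h2 = weak_learn(sample(draw_d2))

    # D3: instances on which h1 and h2 disagree.  The full construction
    # first checks whether h1 and h2 agree so often that h1 alone is
    # already accurate enough; this sketch simply falls back to D after
    # a bounded number of rejections.
    def draw_d3():
        for _ in range(10_000):
            x = draw_example()
            if h1(x) != h2(x):
                return x
        return draw_example()
    h3 = weak_learn(sample(draw_d3))

    # Final hypothesis: majority vote of the three weak hypotheses
    # (labels assumed to be 0 or 1).
    return lambda x: 1 if h1(x) + h2(x) + h3(x) >= 2 else 0

Applied recursively, a stage of this form drives a weak error of β down toward 3β² − 2β³, which is the kind of improvement the paper's analysis quantifies.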
Cite this article
Schapire, R.E. The strength of weak learnability. Mach Learn 5, 197–227 (1990). https://doi.org/10.1007/BF00116037