Abstract
Bagging and boosting reduce error by perturbing both the inputs and the outputs to form altered training sets, growing predictors on these perturbed training sets, and combining them. An interesting question is whether comparable performance can be obtained by perturbing the outputs alone. Two methods of randomizing outputs are tested: output smearing and output flipping. Both are shown to consistently do better than bagging.
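For concreteness, the two schemes named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the noise scale `noise_frac`, the flip probability `flip_prob`, and the ensemble size are illustrative assumptions rather than the paper's calibrated values.

```python
# Sketch of output-perturbation ensembles: perturb only the outputs,
# grow a tree on each perturbed copy, and combine the predictions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor


def output_smearing_ensemble(X, y, n_trees=25, noise_frac=0.5, rng=None):
    """Regression: add mean-zero Gaussian noise to the outputs, fit a
    tree to each smeared copy, and average the trees' predictions."""
    rng = np.random.default_rng(rng)
    scale = noise_frac * np.std(y)  # assumed scaling, not the paper's rule
    trees = []
    for _ in range(n_trees):
        y_smeared = y + rng.normal(0.0, scale, size=len(y))
        trees.append(DecisionTreeRegressor().fit(X, y_smeared))
    return lambda X_new: np.mean([t.predict(X_new) for t in trees], axis=0)


def output_flipping_ensemble(X, y, n_trees=25, flip_prob=0.1, rng=None):
    """Classification: flip each label to a different class with small
    probability, fit a tree to each flipped copy, and majority-vote.
    Assumes y is an integer NumPy array with labels 0..K-1."""
    rng = np.random.default_rng(rng)
    classes = np.unique(y)
    trees = []
    for _ in range(n_trees):
        y_flipped = y.copy()
        flip = rng.random(len(y)) < flip_prob
        for i in np.where(flip)[0]:
            # Replace the label with a uniformly chosen different class.
            y_flipped[i] = rng.choice(classes[classes != y[i]])
        trees.append(DecisionTreeClassifier().fit(X, y_flipped))

    def predict(X_new):
        votes = np.stack([t.predict(X_new) for t in trees])
        return np.array([np.bincount(col.astype(int)).argmax()
                         for col in votes.T])

    return predict
```

Unlike bagging, every predictor here sees the full, unresampled input matrix `X`; all of the randomness enters through the outputs, which is exactly the question the paper investigates.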
Cite this article
Breiman, L. Randomizing Outputs to Increase Prediction Accuracy. Machine Learning 40, 229–242 (2000). https://doi.org/10.1023/A:1007682208299