
Randomizing Outputs to Increase Prediction Accuracy

Published: 01 September 2000

Abstract

Bagging and boosting reduce error by perturbing both the inputs and the outputs of the training set, growing predictors on the perturbed training sets, and combining them. An interesting question is whether comparable performance can be obtained by perturbing the outputs alone. Two methods of randomizing outputs are experimented with: one called output smearing, the other output flipping. Both consistently outperform bagging.
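
The two schemes are easy to state: output flipping perturbs a classification training set by changing a random fraction of the class labels before growing each tree, while output smearing perturbs a regression training set by adding Gaussian noise to each target. The inputs are left untouched, so only the outputs vary across ensemble members. Below is a minimal sketch of both procedures using scikit-learn decision trees; the function names and the flip_rate and noise_scale parameters are illustrative assumptions, and the uniform flip rule is a simplification of the class-proportion-aware flipping used in the paper.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

    def output_flipping_ensemble(X, y, n_trees=50, flip_rate=0.2, seed=0):
        """Classification: each member sees the same inputs X, but a random
        flip_rate fraction of the labels is replaced by a randomly drawn class.
        (Sketch only; the paper calibrates flip rates by class proportions.)"""
        rng = np.random.default_rng(seed)
        classes = np.unique(y)
        trees = []
        for _ in range(n_trees):
            y_pert = y.copy()
            flip = rng.random(len(y)) < flip_rate
            y_pert[flip] = rng.choice(classes, size=int(flip.sum()))
            trees.append(DecisionTreeClassifier().fit(X, y_pert))
        return trees

    def vote(trees, X):
        """Combine classifiers by unweighted plurality vote
        (assumes integer class labels)."""
        votes = np.stack([t.predict(X) for t in trees]).astype(int)
        return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)

    def output_smearing_ensemble(X, y, n_trees=50, noise_scale=1.0, seed=0):
        """Regression: each member is fit to y plus zero-mean Gaussian noise
        scaled to the spread of y; predictions are averaged at the end."""
        rng = np.random.default_rng(seed)
        trees = []
        for _ in range(n_trees):
            y_pert = y + noise_scale * y.std() * rng.standard_normal(len(y))
            trees.append(DecisionTreeRegressor().fit(X, y_pert))
        return trees

    def average(trees, X):
        """Combine regressors by averaging their predictions."""
        return np.mean([t.predict(X) for t in trees], axis=0)

Note that, unlike bagging, every member trains on the full, unresampled input set; the diversity that makes the vote or average better than a single tree comes entirely from the randomized outputs.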




Published In

Machine Learning, Volume 40, Issue 3
September 2000, 96 pages
ISSN: 0885-6125

Publisher

Kluwer Academic Publishers, United States

Author Tags

  1. ensemble
  2. output variability
  3. randomization
