Random Forest Pruning Techniques: A Recent Review

Youness Manzali¹ & Mohamed Elfar¹

Abstract

Random forest (RF) is one of the most widely used machine learning algorithms because of its high predictive performance. However, many studies criticize it for generating a large number of trees, which requires substantial storage space and significant learning time. In addition, the final model induced by RF may contain redundant trees, as well as trees that do not contribute to the prediction and may even degrade performance. For this reason, many researchers try to reduce the number of trees in a forest, a process called forest pruning. This article surveys work on pruning random forest classifiers, explains in detail the operating principle of each technique, and lists their advantages and disadvantages. Finally, it compares their classification performance in terms of accuracy, learning speed, and complexity.
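
To make the idea of forest pruning concrete, the following is a minimal sketch of one generic approach: order the trees greedily by their contribution to validation accuracy, then keep the best-scoring prefix. It is not taken from the article and does not reproduce any specific surveyed technique. The function names (`majority_vote`, `ensemble_accuracy`, `greedy_prune`) and the toy prediction matrix are hypothetical, chosen only for illustration; each row of `tree_preds` is assumed to hold the labels one already-trained tree predicts on a held-out validation set.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label; ties go to the smallest label."""
    counts = Counter(votes)
    return max(sorted(counts), key=counts.get)


def ensemble_accuracy(tree_preds, selected, y_true):
    """Accuracy of the majority vote over the trees indexed by `selected`."""
    correct = 0
    for i, label in enumerate(y_true):
        votes = [tree_preds[t][i] for t in selected]
        if majority_vote(votes) == label:
            correct += 1
    return correct / len(y_true)


def greedy_prune(tree_preds, y_true):
    """Order trees greedily, then keep the shortest best-scoring prefix."""
    order = []
    remaining = list(range(len(tree_preds)))
    prefix_scores = []
    while remaining:
        # Add the tree that gives the best accuracy together with those chosen so far.
        best = max(
            remaining,
            key=lambda t: ensemble_accuracy(tree_preds, order + [t], y_true),
        )
        order.append(best)
        remaining.remove(best)
        prefix_scores.append(ensemble_accuracy(tree_preds, order, y_true))
    # index() returns the first maximum, so the smallest sub-forest wins ties.
    best_len = prefix_scores.index(max(prefix_scores)) + 1
    return order[:best_len]


# Hypothetical validation labels and per-tree predictions (illustrative data only).
y_val = [0, 1, 1, 0, 1, 0]
tree_preds = [
    [0, 1, 1, 0, 0, 0],  # tree 0: one error
    [1, 1, 1, 0, 1, 0],  # tree 1: one error, on a different sample
    [0, 0, 1, 0, 1, 0],  # tree 2: one error, on a different sample
    [0, 1, 1, 0, 0, 0],  # tree 3: redundant copy of tree 0
    [1, 0, 0, 1, 0, 1],  # tree 4: wrong on every sample
]

pruned = greedy_prune(tree_preds, y_val)
full = list(range(len(tree_preds)))
print("kept trees:", pruned)
print("full forest accuracy:", ensemble_accuracy(tree_preds, full, y_val))
print("pruned forest accuracy:", ensemble_accuracy(tree_preds, pruned, y_val))
```

On this toy data the sketch keeps three of the five trees, dropping the redundant copy and the harmful tree, and the pruned vote classifies all six validation samples correctly where the full forest gets five. Real forests would need a separate test set to check that the gain is not an artifact of selecting on the validation data.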

Data Availability

The datasets used during the current study are freely available in the UCI repository.

Code Availability

The code will be available upon request to reviewers.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

Laboratory of Applied Physics, Computer Science and Statistics, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
Youness Manzali & Mohamed Elfar

Contributions

The authors confirm their contribution to the paper as follows:

• Study conception and design: Y. Manzali, M. Elfar

• Analysis and interpretation of results: M. Elfar

• Draft manuscript preparation: Y. Manzali

• Critical revision of the article: M. Elfar

All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Youness Manzali.

Ethics declarations

Ethical Approval

Not applicable

Consent to Participate

Not applicable

Consent for Publication

All authors of the manuscript have agreed to authorship, read and approved the manuscript, and given consent for its submission.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Manzali, Y., Elfar, M. Random Forest Pruning Techniques: A Recent Review. Oper. Res. Forum 4, 43 (2023). https://doi.org/10.1007/s43069-023-00223-6

Received: 27 January 2023
Accepted: 20 April 2023
Published: 19 May 2023
DOI: https://doi.org/10.1007/s43069-023-00223-6
