Abstract
In ensemble methods, each base learner is usually trained on a resampled version of the original training sample of the same size. In this paper we instead use resampling without replacement (subsampling) with a low subsampling ratio, i.e., each subsample is smaller than the original training sample. Our main objective is to examine whether several well-known ensemble methods remain competitive and scalable at low subsampling ratios, and to compare them with their original counterparts. We consider three ensemble methods: bagging, AdaBoost, and bundling, each using a fully grown decision tree as the base classifier. We apply the subsampled versions of these ensembles to several well-known benchmark datasets and measure both the error rate and the running time of each method. The experiments indicate that, for bagging and AdaBoost with a low subsampling ratio, the error rate is in most cases inversely related to the subsample size, whereas for bundling the opposite holds. Overall, bundling achieves the best accuracy at low subsampling ratios on almost all datasets, while bagging is the most effective at reducing time complexity.
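The paper itself shows no code; as a rough illustration of the setup the abstract describes, the following is a minimal sketch of subsampled bagging ("subagging") using scikit-learn's BaggingClassifier, with bootstrap=False for sampling without replacement and max_samples set to a low subsampling ratio. The dataset, ratio of 0.2, and ensemble size are illustrative assumptions, not the paper's experimental settings.

```python
# Minimal sketch (not the authors' code): bagging with low-ratio subsampling,
# i.e. each base tree sees a without-replacement subsample of 20% of the data.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # illustrative benchmark dataset

subagging = BaggingClassifier(
    DecisionTreeClassifier(),  # fully grown (unpruned) tree as base classifier
    n_estimators=100,          # illustrative ensemble size
    max_samples=0.2,           # low subsampling ratio: 20% of the training set
    bootstrap=False,           # resampling WITHOUT replacement (subsampling)
    random_state=0,
)

scores = cross_val_score(subagging, X, y, cv=10)
print(f"10-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

An analogous setup applies to AdaBoost by subsampling before each boosting round; bundling has no scikit-learn counterpart and would require a custom implementation following Hothorn and Lausen.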
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Faisal, Z., Hirose, H. (2010). A Comparative Study on the Performance of Several Ensemble Methods with Low Subsampling Ratio. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds) Intelligent Information and Database Systems. ACIIDS 2010. Lecture Notes in Computer Science, vol. 5991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12101-2_33
Print ISBN: 978-3-642-12100-5
Online ISBN: 978-3-642-12101-2