Abstract
Binarization techniques address a multiclass classification problem by combining several binary classifiers. They were originally introduced so that methods able to deal only with two classes (e.g., SVM) could be applied to multiclass problems. Nevertheless, it has been shown that they can also be useful with classification methods that deal with multiclass problems directly (e.g., decision trees), because they can improve the results. This work studies whether such an improvement is also possible when using ensembles of decision trees (e.g., Random Forest, Boosting), over 67 multiclass datasets. It was found that some combinations of a binarization technique and an ensemble method improve on the results of the ensemble method without binarization.
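As an illustration of the kind of combination the study evaluates, the sketch below pairs common binarization wrappers (one-vs-one, one-vs-all, and output codes) with a Random Forest and compares them against the plain ensemble via cross-validation. The library (scikit-learn), dataset, and parameters are assumptions for illustration only, not the paper's experimental setup.

```python
# Minimal sketch (assumed setup, not the paper's experiments):
# wrap a decision-tree ensemble with binarization strategies and compare
# cross-validated accuracy against the same ensemble without binarization.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import (OneVsOneClassifier, OneVsRestClassifier,
                                OutputCodeClassifier)

# Any multiclass dataset would do; iris (3 classes) is used for illustration.
X, y = load_iris(return_X_y=True)

base = RandomForestClassifier(n_estimators=100, random_state=1)

candidates = {
    "Random Forest (no binarization)": base,
    "One-vs-One + Random Forest": OneVsOneClassifier(base),
    "One-vs-All + Random Forest": OneVsRestClassifier(base),
    # Output-code binarization (ECOC-style random codes in scikit-learn)
    "Output codes + Random Forest": OutputCodeClassifier(base, code_size=2,
                                                         random_state=1),
}

for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

Each wrapper decomposes the multiclass task into binary subproblems, trains one copy of the ensemble per subproblem, and aggregates the binary decisions at prediction time, which is the general scheme whose benefit the paper measures.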
Acknowledgments
This work was partially supported by the project TIN2011-24046 of the Spanish Ministry of Economy and Competitiveness.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Rodríguez, J.J., Díez-Pastor, J.F., Arnaiz-González, Á., García-Osorio, C. (2015). An Experimental Study on Combining Binarization Techniques and Ensemble Methods of Decision Trees. In: Schwenker, F., Roli, F., Kittler, J. (eds) Multiple Classifier Systems. MCS 2015. Lecture Notes in Computer Science, vol. 9132. Springer, Cham. https://doi.org/10.1007/978-3-319-20248-8_16
Print ISBN: 978-3-319-20247-1
Online ISBN: 978-3-319-20248-8