Abstract
This paper introduces a new ensemble approach, Feature-Subspace Aggregating (Feating), which builds local models rather than global models. Feating is a generic ensemble approach that can enhance the predictive performance of both stable and unstable learners, whereas most existing ensemble approaches improve the predictive performance of unstable learners only. Our analysis shows that the increased level of localisation in Feating reduces the execution time required to generate each model in the ensemble. Our empirical evaluation shows that Feating performs significantly better than Boosting, Random Subspace and Bagging in terms of predictive accuracy when the stable learner SVM is used as the base learner. The speed-up achieved by Feating makes SVM ensembles feasible for large data sets on which they would otherwise be infeasible. When SVM is the preferred base learner, we show that Feating SVM performs better than Boosting decision trees and Random Forests. We further demonstrate that Feating also substantially reduces the error of another stable learner, k-nearest neighbour, and of an unstable learner, the decision tree.
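To make the idea of a local-model ensemble concrete, the sketch below is a minimal, illustrative Feating-style ensemble in Python with scikit-learn. It is not the authors' exact algorithm: it assumes each ensemble member partitions the training data by the discretised values of a small random feature subset and trains one local SVM per partition, falling back to a single global SVM for sparse regions, then majority-votes the members' predictions. The class name FeatingStyleEnsemble and the parameters n_members, subset_size, n_bins and min_region_size are hypothetical choices made for this sketch.

```python
# Illustrative Feating-style local-model ensemble (a sketch, not the published algorithm).
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KBinsDiscretizer, StandardScaler


class FeatingStyleEnsemble:
    def __init__(self, n_members=10, subset_size=2, n_bins=3,
                 min_region_size=20, random_state=0):
        self.n_members = n_members          # number of ensemble members (hypothetical default)
        self.subset_size = subset_size      # features used to define local regions
        self.n_bins = n_bins                # bins per feature when discretising
        self.min_region_size = min_region_size
        self.rng = np.random.default_rng(random_state)

    def fit(self, X, y):
        self.members_ = []
        # Global SVM used as a fallback for regions with too few training instances.
        self.global_model_ = SVC(kernel="linear").fit(X, y)
        n_features = X.shape[1]
        for _ in range(self.n_members):
            subset = self.rng.choice(n_features, self.subset_size, replace=False)
            disc = KBinsDiscretizer(n_bins=self.n_bins, encode="ordinal",
                                    strategy="quantile")
            keys = [tuple(row) for row in disc.fit_transform(X[:, subset])]
            local_models = {}
            for key in set(keys):
                idx = [i for i, k in enumerate(keys) if k == key]
                # Train a local SVM only where the region has enough data and >1 class.
                if len(idx) >= self.min_region_size and len(set(y[idx])) > 1:
                    local_models[key] = SVC(kernel="linear").fit(X[idx], y[idx])
            self.members_.append((subset, disc, local_models))
        return self

    def predict(self, X):
        votes = []
        for subset, disc, local_models in self.members_:
            keys = [tuple(row) for row in disc.transform(X[:, subset])]
            # Route each test point to the local model of its region, else the global model.
            preds = [local_models.get(k, self.global_model_).predict(x.reshape(1, -1))[0]
                     for k, x in zip(keys, X)]
            votes.append(preds)
        votes = np.array(votes)  # shape: (n_members, n_samples)
        # Majority vote across ensemble members.
        return np.array([np.bincount(col).argmax() for col in votes.T])


if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    X = StandardScaler().fit_transform(X)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = FeatingStyleEnsemble().fit(X_tr, y_tr)
    print("accuracy:", (model.predict(X_te) == y_te).mean())
```

The localisation in this sketch comes from fitting each SVM on only the instances that fall in a region, which is the mechanism by which per-model training time can shrink as the level of localisation increases, in the spirit of the claim made in the abstract.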
Additional information
Editor: Mark Craven.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Ting, K.M., Wells, J.R., Tan, S.C. et al. Feature-subspace aggregating: ensembles for stable and unstable learners. Mach Learn 82, 375–397 (2011). https://doi.org/10.1007/s10994-010-5224-5