
Using Correspondence Analysis to Combine Classifiers

Published: 01 July 1999

Abstract

Several effective methods have been developed recently for improving predictive performance by generating and combining multiple learned models. The general approach is to create a set of learned models either by applying an algorithm repeatedly to different versions of the training data, or by applying different learning algorithms to the same data. The predictions of the models are then combined according to a voting scheme. This paper focuses on the task of combining the predictions of a set of learned models. The method described uses the strategies of stacking and Correspondence Analysis to model the relationship between the learning examples and their classification by a collection of learned models. A nearest neighbor method is then applied within the resulting representation to classify previously unseen examples. The new algorithm does not perform worse than, and frequently performs significantly better than, other combining techniques on a suite of data sets.
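
The abstract describes a three-stage combining pipeline: build a stacking-style table of the base models' predictions on the training examples, embed that table with correspondence analysis, and classify unseen examples with a nearest neighbor rule in the embedded space. The sketch below is a minimal illustration of that pipeline, not the paper's exact algorithm: the indicator coding of predictions plus the true class, the projection of test examples as supplementary points, the choice to assign the nearest true-class column point, and the scikit-learn base learners and iris data are all assumptions made for the example.

# Minimal sketch (an illustration, not the paper's exact construction) of
# combining classifiers with stacking + correspondence analysis + a nearest
# neighbor rule. Base learners, data set, and coding choices are assumptions.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def one_hot(labels, n_classes):
    """Indicator (one-hot) coding of a vector of class labels."""
    out = np.zeros((len(labels), n_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def correspondence_analysis(N, n_dims=2):
    """Row and column principal coordinates of an indicator matrix N."""
    P = N / N.sum()                                       # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)                   # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))    # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U * sv) / np.sqrt(r)[:, None]                    # row (example) coordinates
    G = (Vt.T * sv) / np.sqrt(c)[:, None]                 # column (category) coordinates
    return F[:, :n_dims], G[:, :n_dims], sv[:n_dims]

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
k = len(np.unique(y))

# Level-0 learned models, as in stacking.
models = [DecisionTreeClassifier(random_state=0), GaussianNB(),
          KNeighborsClassifier(3)]
for m in models:
    m.fit(X_tr, y_tr)

# Indicator matrix: one block of columns per model's predictions on the
# training examples, plus a final block coding each example's true class.
N = np.hstack([one_hot(m.predict(X_tr), k) for m in models] + [one_hot(y_tr, k)])
F, G, sv = correspondence_analysis(N)
class_points = G[-k:]                     # column points of the true-class block

# Project test examples as supplementary rows (true class unknown, so that
# block stays zero) via the CA transition formula: profile @ G / singular values.
N_te = np.hstack([one_hot(m.predict(X_te), k) for m in models]
                 + [np.zeros((len(X_te), k))])
F_te = (N_te / N_te.sum(axis=1, keepdims=True)) @ G / sv

# Nearest-neighbor rule in the CA space: assign the closest class point.
dists = np.linalg.norm(F_te[:, None, :] - class_points[None, :, :], axis=2)
print("combined accuracy:", np.mean(dists.argmin(axis=1) == y_te))

Projecting test rows with the correspondence-analysis transition formula (row profile times column coordinates, rescaled by the singular values) keeps training and test examples in the same coordinate system, which is what allows a simple nearest neighbor rule to operate on the combined representation.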

Published In

Machine Learning, Volume 36, Issue 1-2
July-August 1999
132 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 July 1999

Author Tags

  1. classification
  2. combining estimates
  3. correspondence analysis
  4. multiple models

Qualifiers

  • Article

Cited By

  • (2023) Trustable Co-Label Learning From Multiple Noisy Annotators. IEEE Transactions on Multimedia, 25, 1045-1057. DOI: 10.1109/TMM.2021.3137752. Online publication date: 1-Jan-2023.
  • (2023) A hybrid ensemble method with negative correlation learning for regression. Machine Learning, 112(10), 3881-3916. DOI: 10.1007/s10994-023-06364-3. Online publication date: 23-Aug-2023.
  • (2022) A heterogeneous online learning ensemble for non-stationary environments. Knowledge-Based Systems, 188. DOI: 10.1016/j.knosys.2019.104983. Online publication date: 21-Apr-2022.
  • (2020) A-Stacking and A-Bagging. Expert Systems with Applications: An International Journal, 146. DOI: 10.1016/j.eswa.2019.113160. Online publication date: 15-May-2020.
  • (2018) CST-Voting. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 35(1), 99-109. DOI: 10.3233/JIFS-169571. Online publication date: 1-Jan-2018.
  • (2018) Heterogeneous classifier ensemble with fuzzy rule-based meta learner. Information Sciences: an International Journal, 422, 144-160. DOI: 10.1016/j.ins.2017.09.009. Online publication date: 1-Jan-2018.
  • (2017) Diversity-induced weighted classifier ensemble learning. 2017 IEEE International Conference on Image Processing (ICIP), 1232-1236. DOI: 10.1109/ICIP.2017.8296478. Online publication date: 17-Sep-2017.
  • (2017) The tech-talk balance. Proceedings of the 10th International Workshop on Cooperative and Human Aspects of Software Engineering, 43-48. DOI: 10.1109/CHASE.2017.8. Online publication date: 20-May-2017.
  • (2016) Random subspace method with class separability weighting. Expert Systems: The Journal of Knowledge Engineering, 33(3), 275-285. DOI: 10.1111/exsy.12149. Online publication date: 1-Jun-2016.
  • (2016) A design framework for hierarchical ensemble of multiple feature extractors and multiple classifiers. Pattern Recognition, 52, 1-16. DOI: 10.1016/j.patcog.2015.11.006. Online publication date: 1-Apr-2016.
