Automatic classifier selection for non-experts

Matthias Reif¹,
Faisal Shafait¹,
Markus Goldstein¹,
Thomas Breuel² &
…
Andreas Dengel¹

Abstract

Choosing a suitable classifier for a given dataset is an important part of developing a pattern recognition system. Because a large variety of classification algorithms has been proposed in the literature, non-experts often do not know which method to use to obtain good classification results on their data. Meta-learning addresses this problem by recommending promising classifiers based on meta-features computed from the dataset. In this paper, we empirically evaluate five categories of state-of-the-art meta-features for their suitability in predicting the classification accuracies of several widely used classifiers (including Support Vector Machines, Neural Networks, Random Forests, Decision Trees, and Logistic Regression). Based on the evaluation results, we have developed the first open-source meta-learning system capable of accurately predicting the accuracies of target classifiers. The user provides a dataset as input and, in a few simple steps, obtains an automatically created, high-performance, ready-to-use pattern recognition system. A user study with non-experts showed that users developed more accurate pattern recognition systems in significantly less development time with our system than with a state-of-the-art data mining software package.
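The meta-learning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the function names (`simple_meta_features`, `predict_accuracies`, `recommend`), the layout of the knowledge base, and the accuracy values in the toy example are all assumptions made for illustration, and a nearest-neighbour lookup stands in for whatever regression models the actual system uses. The sketch computes a few simple meta-features from a dataset and estimates classifier accuracies from previously evaluated datasets with similar meta-features.

```python
import math
from collections import Counter


def simple_meta_features(X, y):
    """Return a small meta-feature vector for a dataset.

    X is a list of attribute vectors, y the list of class labels.
    The chosen features (log sample count, log attribute count,
    class count, class entropy) are illustrative assumptions.
    """
    n_samples = len(X)
    n_attributes = len(X[0])
    counts = Counter(y)
    # Shannon entropy of the class distribution, in bits.
    class_entropy = -sum(
        (c / n_samples) * math.log2(c / n_samples) for c in counts.values()
    )
    return [
        math.log10(n_samples),
        math.log10(n_attributes),
        float(len(counts)),
        class_entropy,
    ]


def predict_accuracies(meta_features, knowledge_base, k=1):
    """Estimate each classifier's accuracy as the mean over the k known
    datasets whose meta-features are closest (Euclidean distance)."""
    ranked = sorted(
        knowledge_base,
        key=lambda entry: math.dist(meta_features, entry["meta"]),
    )
    nearest = ranked[:k]
    return {
        name: sum(e["accuracy"][name] for e in nearest) / len(nearest)
        for name in nearest[0]["accuracy"]
    }


def recommend(meta_features, knowledge_base, k=1):
    """Return the classifier with the highest estimated accuracy."""
    predicted = predict_accuracies(meta_features, knowledge_base, k)
    return max(predicted, key=predicted.get)


# Toy usage. The accuracies below are invented placeholders, not results.
toy_X = [[0.1, 1.0], [0.2, 0.9], [0.8, 0.1], [0.9, 0.2]]
toy_y = ["a", "a", "b", "b"]
toy_knowledge_base = [
    {"meta": [0.6, 0.3, 2.0, 1.0],
     "accuracy": {"svm": 0.9, "decision_tree": 0.8}},
    {"meta": [3.0, 1.5, 10.0, 3.2],
     "accuracy": {"svm": 0.7, "decision_tree": 0.75}},
]
print(recommend(simple_meta_features(toy_X, toy_y), toy_knowledge_base))
```

In a real setting the knowledge base would be built by running each candidate classifier on many datasets and recording the measured accuracies; the lookup then replaces a costly evaluation of every classifier on the new dataset.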

Author information

Authors and Affiliations

German Research Center for Artificial Intelligence (DFKI), Trippstadter Strasse 122, 67663, Kaiserslautern, Germany
Matthias Reif, Faisal Shafait, Markus Goldstein & Andreas Dengel
Department of Computer Science, University of Kaiserslautern, Postfach 3049, 67653, Kaiserslautern, Germany
Thomas Breuel

Corresponding author

Correspondence to Matthias Reif.

Appendix A: Selected meta-features

See Table 5.

About this article

Cite this article

Reif, M., Shafait, F., Goldstein, M. et al. Automatic classifier selection for non-experts. Pattern Anal Applic 17, 83–96 (2014). https://doi.org/10.1007/s10044-012-0280-z

Received: 04 May 2011
Accepted: 25 June 2012
Published: 17 July 2012
Issue Date: February 2014
DOI: https://doi.org/10.1007/s10044-012-0280-z
