Abstract
Choosing a suitable classifier for a given dataset is an important part of developing a pattern recognition system. Because a large variety of classification algorithms has been proposed in the literature, non-experts often do not know which method to use to obtain good classification results on their data. Meta-learning addresses this problem by recommending promising classifiers based on meta-features computed from a given dataset. In this paper, we empirically evaluate five categories of state-of-the-art meta-features for their suitability in predicting the classification accuracies of several widely used classifiers (including Support Vector Machines, Neural Networks, Random Forests, Decision Trees, and Logistic Regression). Based on the evaluation results, we have developed the first open source meta-learning system that is capable of accurately predicting the accuracies of target classifiers. The user provides a dataset as input and receives an automatically created, high-performance, ready-to-use pattern recognition system in a few simple steps. A user study with non-experts showed that users developed more accurate pattern recognition systems in significantly less development time with our system than with state-of-the-art data mining software.
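To make the idea of recommending a classifier from meta-features concrete, the following is a minimal sketch and not the system described in the paper. It assumes a handful of simple meta-features (dataset size, dimensionality, number of classes, class entropy) and uses a k-nearest-neighbour lookup as a stand-in for the accuracy predictor. The function names, the chosen meta-features, and every number in TOY_META_BASE are hypothetical placeholders invented for illustration; they are not results from the paper.

```python
import math
from collections import Counter


def simple_meta_features(X, y):
    """Compute a few illustrative meta-features for a dataset (X rows, y labels)."""
    n_instances = len(X)
    n_features = len(X[0])
    counts = Counter(y)
    # Shannon entropy of the class distribution, in bits.
    class_entropy = -sum(
        (c / n_instances) * math.log2(c / n_instances) for c in counts.values()
    )
    return [math.log10(n_instances), math.log10(n_features), len(counts), class_entropy]


def predict_accuracies(meta_features, meta_base, k=2):
    """Average the known accuracies of the k most similar datasets in meta_base."""
    ranked = sorted(
        meta_base, key=lambda entry: math.dist(entry["features"], meta_features)
    )
    neighbours = ranked[:k]
    names = neighbours[0]["accuracies"].keys()
    return {
        name: sum(n["accuracies"][name] for n in neighbours) / len(neighbours)
        for name in names
    }


def recommend(meta_features, meta_base, k=2):
    """Return the classifier with the highest predicted accuracy, plus all predictions."""
    predicted = predict_accuracies(meta_features, meta_base, k)
    return max(predicted, key=predicted.get), predicted


# Hypothetical meta-knowledge base: made-up meta-features and accuracies,
# standing in for results gathered on previously seen datasets.
TOY_META_BASE = [
    {"features": [2.0, 1.0, 2, 1.0], "accuracies": {"svm": 0.90, "decision_tree": 0.80}},
    {"features": [3.0, 1.5, 3, 1.5], "accuracies": {"svm": 0.70, "decision_tree": 0.85}},
    {"features": [2.2, 0.9, 2, 0.9], "accuracies": {"svm": 0.88, "decision_tree": 0.82}},
]

if __name__ == "__main__":
    best, predicted = recommend([2.1, 1.0, 2, 1.0], TOY_META_BASE, k=2)
    print(best, predicted)
```

In a real setting the meta-knowledge base would be filled by running each target classifier on many datasets, and the nearest-neighbour lookup could be replaced by any regression model trained to map meta-features to accuracy.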
Appendix: A selected meta-features
See Table 5
Reif, M., Shafait, F., Goldstein, M. et al. Automatic classifier selection for non-experts. Pattern Anal Applic 17, 83–96 (2014). https://doi.org/10.1007/s10044-012-0280-z
DOI: https://doi.org/10.1007/s10044-012-0280-z