research-article · DOI: 10.1145/3377930.3389805

Improving generalisation of AutoML systems with dynamic fitness evaluations

Published: 26 June 2020

Abstract

A common problem faced by machine learning developers is overfitting: fitting a pipeline so closely to the training data that performance degrades on unseen data. Automated machine learning aims to free (or at least ease) the developer from the burden of pipeline creation, but the overfitting problem can persist. In fact, it can worsen as we iteratively optimise performance on an internal cross-validation (most often k-fold). While this internal cross-validation is intended to reduce overfitting, we show that the search can still overfit to the particular folds used. In this work, we aim to remedy this problem by introducing dynamic fitness evaluations which approximate repeated k-fold cross-validation at little extra cost over a single k-fold, and at far lower cost than typical repeated k-fold. The results show that, when time is equated, the proposed fitness function yields a significant improvement over the current state-of-the-art baseline method, which uses a single internal k-fold. Furthermore, the proposed extension is very simple to implement on top of existing evolutionary computation methods, and can provide an essentially free boost in generalisation/testing performance.
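The abstract does not spell out the mechanism, but one plausible reading of "dynamic fitness evaluations" is that the fold split used for fitness is re-randomised each generation, so that over the course of the run the population is selected against many different splits, much like repeated k-fold CV, while each individual evaluation still costs only one k-fold. The sketch below illustrates that idea on a toy problem; the `kfold`, `fitness`, and `evolve` names, the single-feature nearest-centroid "pipeline", and the seed-per-generation scheme are all illustrative assumptions, not the paper's actual implementation.

```python
import random

def kfold(n, k, seed):
    """Shuffle indices with `seed` and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fitness(feature, X, y, k, seed):
    """k-fold CV accuracy of a nearest-centroid classifier on one feature.

    Because the folds depend on `seed`, re-seeding each generation exposes
    the search to a fresh split on every evaluation.
    """
    correct, total = 0, 0
    for test in kfold(len(X), k, seed):
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        # "Training": per-class mean of the chosen feature.
        means = {}
        for c in (0, 1):
            vals = [X[i][feature] for i in train if y[i] == c]
            means[c] = sum(vals) / len(vals)
        for i in test:
            pred = min((0, 1), key=lambda c: abs(X[i][feature] - means[c]))
            correct += pred == y[i]
            total += 1
    return correct / total

def evolve(X, y, k=5, generations=10, candidates=(0, 1)):
    """Toy selection loop: the fold seed changes every generation, so the
    overall winner is effectively chosen by something close to repeated
    k-fold CV, at the cost of a single k-fold per generation."""
    wins = {c: 0 for c in candidates}
    for g in range(generations):  # seed = generation index
        best = max(candidates, key=lambda c: fitness(c, X, y, k, seed=g))
        wins[best] += 1
    return max(wins, key=wins.get)
```

Under this reading, a static fitness function would reuse one fixed seed for every generation, letting the search exploit quirks of that particular split; varying the seed turns such quirks into noise that selection averages away.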


Cited By

  • (2024) SR-Forest: A Genetic Programming-Based Heterogeneous Ensemble Learning Method. IEEE Transactions on Evolutionary Computation 28(5), 1484-1498. DOI: 10.1109/TEVC.2023.3243172. Online publication date: Oct 2024.

Published In

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
June 2020
1349 pages
ISBN:9781450371285
DOI:10.1145/3377930

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. AutoML
  2. automated machine learning
  3. dynamic fitness evaluations
  4. generalisation
  5. regularization
  6. regularized evolution

Qualifiers

  • Research-article

Conference

GECCO '20

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
