[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3449726.3463187acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Using deep Q-network for selection hyper-heuristics

Published: 08 July 2021 Publication History

Abstract

Hyper-Heuristics is an active research field that aims to automatically select (or generate) the best low-level heuristic in each step of the search process. This work investigates a Hyper-Heuristic with a Deep Q-Network (DQN) selection strategy and compares it with two state-of-the-art approaches, namely the Dynamic MAB and the Fitness-Rate-Rank MAB. The experiments conducted on two domains from the HyFlex framework showed that the DQN approach outperformed the others on the Vehicle Routing Problem and was competitive on the Traveling Salesman Problem. This indicates that the DQN is a robust selection strategy that is less sensitive to the domain than the MAB based approaches.

References

[1]
Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine learning 47, 2 (2002), 235--256.
[2]
Christian Blum, Jakob Puchinger, Günther R. Raidl, and Andrea Roli. 2011. Hybrid metaheuristics in combinatorial optimization: A survey. Applied Soft Computing 11, 6 (2011), 4135 -- 4151.
[3]
Edmund K. Burke, Matthew Hyde, Graham Kendall, Gabriela Ochoa, Ender Özcan, and John R. Woodward. 2010. A Classification of Hyper-heuristic Approaches. Springer US, Boston, MA, 449--468.
[4]
Luis DaCosta, Alvaro Fialho, Marc Schoenauer, and Michèle Sebag. 2008. Adaptive Operator Selection with Dynamic Multi-Armed Bandits. In Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO '08). Association for Computing Machinery, New York, NY, USA, 913--920.
[5]
Carolina P. de Almeida, Richard A. Gonçalves, Sandra M. Venske, Ricardo Lüders, and Myriam Regattieri Delgado. 2018. Multi-armed Bandit Based Hyper-Heuristics for the Permutation Flow Shop Problem. In 7th Brazilian Conference on Intelligent Systems, BRACIS 2018, São Paulo, Brazil, October 22-25, 2018. IEEE Computer Society, 139--144.
[6]
John H. Drake, Ahmed Kheiri, Ender Özcan, and Edmund K. Burke. 2020. Recent advances in selection hyper-heuristics. European Journal of Operational Research 285, 2 (2020), 405--428.
[7]
A. S. Ferreira, R. A. Gonçalves, and A. Pozo. 2017. A Multi-Armed Bandit selection strategy for Hyper-heuristics. In 2017 IEEE Congress on Evolutionary Computation (CEC). 525--532.
[8]
Álvaro Fialho. 2010. Adaptive Operator Selection for Optimization. Theses. Université Paris Sud - Paris XI. https://tel.archives-ouvertes.fr/tel-00578431
[9]
Stephanus Daniel Handoko, Duc Thien Nguyen, Zhi Yuan, and Hoong Chuin Lau. 2014. Reinforcement Learning for Adaptive Operator Selection in Memetic Search Applied to Quadratic Assignment Problem. In Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation (GECCO Comp '14). Association for Computing Machinery, New York, NY, USA, 193--194.
[10]
K. Li, Á. Fialho, S. Kwong, and Q. Zhang. 2014. Adaptive Operator Selection With Bandits for a Multiobjective Evolutionary Algorithm Based on Decomposition. IEEE Transactions on Evolutionary Computation 18, 1 (2014), 114--130.
[11]
G. Ochoa, M. Hyde, T. Curtois, J.A. Vazquez-Rodriguez, J. Walker, M. Gendreau, G. Kendall, B. McCollum, A.J. Parkes, S. Petrovic, and E.K. Burke. 2012. HyFlex: A Benchmark Framework for Cross-domain Heuristic Search. 7245 (2012), 136--147.
[12]
Martin L. Puterman. 1990. Chapter 8 Markov Decision Processes. In Handbooks in Operations Research and Management Science. Stochastic Models, Vol. 2. Elsevier, 331--434.
[13]
Jorge A. Soria-Alcaraz, Gabriela Ochoa, Marco A. Sotelo-Figeroa, and Edmund K. Burke. 2017. A methodology for determining an effective subset of heuristics in selection hyper-heuristics. European Journal of Operational Research 260, 3 (2017), 972--983.
[14]
Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning, Second Edition: An Introduction. MIT Press.
[15]
Teck-Hou Teng, Stephanus Daniel Handoko, and Hoong Chuin Lau. 2016. Self-Organizing Neural Network for Adaptive Operator Selection in Evolutionary Search. In Learning and Intelligent Optimization (Lecture Notes in Computer Science), Paola Festa, Meinolf Sellmann, and Joaquin Vanschoren (Eds.). Springer International Publishing, Cham, 187--202.
[16]
Christopher J. C. H. Watkins and Peter Dayan. 1992. Q-Learning. Machine Learning 8, 3 (May 1992), 279--292.
[17]
D. H. Wolpert and W. G. Macready. 1997. No Free Lunch Theorems for Optimization. IEEE Transactions on Evolutionary Computation 1, 1 (April 1997), 67--82.

Cited By

View all
  • (2024)A review of reinforcement learning based hyper-heuristicsPeerJ Computer Science10.7717/peerj-cs.214110(e2141)Online publication date: 28-Jun-2024
  • (2024)Learning from Offline and Online Experiences: A Hybrid Adaptive Operator Selection FrameworkProceedings of the Genetic and Evolutionary Computation Conference10.1145/3638529.3654062(1017-1025)Online publication date: 14-Jul-2024
  • (2023)PHH: Policy-Based Hyper-Heuristic With Reinforcement LearningIEEE Access10.1109/ACCESS.2023.327795311(52026-52049)Online publication date: 2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
GECCO '21: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2021
2047 pages
ISBN:9781450383516
DOI:10.1145/3449726
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 July 2021

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. combinatorial optimization
  2. hyper-heuristic
  3. reinforcement learning

Qualifiers

  • Research-article

Funding Sources

  • CAPES
  • CNPq

Conference

GECCO '21
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)33
  • Downloads (Last 6 weeks)8
Reflects downloads up to 21 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)A review of reinforcement learning based hyper-heuristicsPeerJ Computer Science10.7717/peerj-cs.214110(e2141)Online publication date: 28-Jun-2024
  • (2024)Learning from Offline and Online Experiences: A Hybrid Adaptive Operator Selection FrameworkProceedings of the Genetic and Evolutionary Computation Conference10.1145/3638529.3654062(1017-1025)Online publication date: 14-Jul-2024
  • (2023)PHH: Policy-Based Hyper-Heuristic With Reinforcement LearningIEEE Access10.1109/ACCESS.2023.327795311(52026-52049)Online publication date: 2023
  • (2023)Automated algorithm design using proximal policy optimisation with identified featuresExpert Systems with Applications10.1016/j.eswa.2022.119461216(119461)Online publication date: Apr-2023

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media