The Quality-Diversity Transformer: Generating Behavior-Conditioned Trajectories with Decision Transformers

Authors:

Valentin Macé,

Raphaël Boige,

Felix Chalumeau,

Thomas Pierrot,

Guillaume Richard,

Nicolas Perrin-Gilbert

GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference

Pages 1221 - 1229

https://doi.org/10.1145/3583131.3590433

Published: 12 July 2023

Abstract

In the context of neuroevolution, Quality-Diversity algorithms have proven effective in generating repertoires of diverse and efficient policies by relying on the definition of a behavior space. A natural goal induced by the creation of such a repertoire is trying to achieve behaviors on demand, which can be done by running the corresponding policy from the repertoire. However, in uncertain environments, two problems arise. First, policies can lack robustness and repeatability, meaning that multiple episodes under slightly different conditions often result in very different behaviors. Second, due to the discrete nature of the repertoire, solutions vary discontinuously. Here we present a new approach to achieve behavior-conditioned trajectory generation based on two mechanisms: First, MAP-Elites Low-Spread (ME-LS), which constrains the selection of solutions to those that are the most consistent in the behavior space. Second, the Quality-Diversity Transformer (QDT), a Transformer-based model conditioned on continuous behavior descriptors, which trains on a dataset generated by policies from a ME-LS repertoire and learns to autoregressively generate sequences of actions that achieve target behaviors. Results show that ME-LS produces consistent and robust policies, and that its combination with the QDT yields a single policy capable of achieving diverse behaviors on demand with high accuracy.
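
The idea of preferring policies that are consistent in the behavior space can be illustrated with a minimal sketch. The abstract does not say how ME-LS measures consistency, so the spread measure used here (mean pairwise Euclidean distance between the behavior descriptors of repeated episodes) is an assumption. The names `spread`, `evaluate`, and `prefer_low_spread` are hypothetical, as is the noise model standing in for real rollouts.

```python
import math
import random
from statistics import mean


def spread(descriptors):
    """Mean pairwise Euclidean distance between behavior descriptors.

    Assumed consistency measure for illustration only; lower means the
    policy lands in nearly the same place of the behavior space each time.
    """
    pairs = [
        (a, b)
        for i, a in enumerate(descriptors)
        for b in descriptors[i + 1:]
    ]
    if not pairs:
        return 0.0
    return mean(math.dist(a, b) for a, b in pairs)


def evaluate(policy_noise, n_episodes, rng):
    """Hypothetical stand-in for rolling out a policy several times.

    Returns one 2-D behavior descriptor per episode, scattered around a
    fixed point with Gaussian noise of scale `policy_noise`.
    """
    center = (0.5, 0.5)
    return [
        tuple(c + rng.gauss(0.0, policy_noise) for c in center)
        for _ in range(n_episodes)
    ]


def prefer_low_spread(candidates, n_episodes=8, seed=0):
    """Pick the candidate whose repeated episodes are the most consistent."""
    rng = random.Random(seed)
    scored = {
        name: spread(evaluate(noise, n_episodes, rng))
        for name, noise in candidates.items()
    }
    return min(scored, key=scored.get), scored


best, scores = prefer_low_spread({"steady": 0.01, "erratic": 0.5})
print(best, scores)
```

In this toy setting the low-noise candidate is selected because its episodes cluster tightly. An actual repertoire would also have to account for fitness and for the cell each solution occupies, which this sketch leaves out.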

Supplementary Material

PDF File (p1221-mace-suppl.pdf)

Supplemental material.

Index Terms

Computing methodologies
  Machine learning
    Machine learning approaches
      Bio-inspired approaches
        Evolutionary robotics

Information & Contributors

Information

Published In

GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference

July 2023

1667 pages

ISBN:9798400701191

DOI:10.1145/3583131

Chair: Sara Silva

Program Chair: Luís Paquete

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

GECCO '23

Sponsor:

SIGEVO

GECCO '23: Genetic and Evolutionary Computation Conference

July 15 - 19, 2023

Lisbon, Portugal

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Bibliometrics & Citations

Total Citations: 4

Total Downloads: 114

Downloads (Last 12 months): 77

Downloads (Last 6 weeks): 5

Reflects downloads up to 13 Dec 2024

Citations

Cited By

Xie Y, Pinskier J, Wang X, Howard D. (2024). Evolutionary Seeding of Diverse Structural Design Solutions via Topology Optimization. ACM Transactions on Evolutionary Learning and Optimization. Online publication date: 5-Jun-2024.
https://doi.org/10.1145/3670693

Batra S, Tjanaka B, Nikolaidis S, Sukhatme G, Li X, Handl J. (2024). Quality Diversity for Robot Learning: Limitations and Future Directions. Proceedings of the Genetic and Evolutionary Computation Conference Companion, 587-590. Online publication date: 14-Jul-2024.
https://dl.acm.org/doi/10.1145/3638530.3654431

Mertan A, Cheney N, Li X, Handl J. (2024). Towards Multi-Morphology Controllers with Diversity and Knowledge Distillation. Proceedings of the Genetic and Evolutionary Computation Conference, 367-376. Online publication date: 14-Jul-2024.
https://dl.acm.org/doi/10.1145/3638529.3654013

Mertan A, Cheney N. (2024). No-brainer: Morphological Computation Driven Adaptive Behavior in Soft Robots. From Animals to Animats 17, 81-92. Online publication date: 7-Sep-2024.
https://doi.org/10.1007/978-3-031-71533-4_6
