DOI: 10.1145/3377929.3398126
Research article

Learning to walk - reward relevance within an enhanced neuroevolution approach

Published: 08 July 2020

Abstract

Recent advances in human motion sensing technologies and machine learning have enhanced the potential of AI to simulate artificial agents that exhibit human-like movements. Human movements are typically explored via experimental recordings, with the aim of establishing relationships between neural and mechanical activities. A recent trend in AI research shows that AI algorithms work remarkably well when combined with sufficient computing resources and data. One common criticism, however, is that these methods are gradient-based and rely on gradient approximation, and thus suffer from the well-known vanishing gradient problem.
In this paper, the goal is to build an ANN-based controller that enables an agent to walk in a human-like way. In particular, the proposed methodology introduces a new approach to neuroevolution built on NeuroEvolution of Augmenting Topologies (NEAT). The original algorithm has been endowed with a different selection-reproduction mechanism and a modified management of the population, with the aim of improving performance and reducing computational effort. Experiments have demonstrated the effectiveness of the proposed approach and have highlighted the interdependence among three key aspects: the reward framework, the chosen Evolutionary Algorithm, and the hyper-parameter configuration. Consequently, none of these aspects can be ignored, and balancing them is crucial for achieving suitable results and good performance.
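To make the setting concrete, the sketch below wires a plain NEAT evaluation loop to a MuJoCo Walker2d environment using the NEAT-Python library and OpenAI Gym. This is a minimal illustration under stated assumptions, not the paper's enhanced algorithm: the modified selection-reproduction mechanism and population management described above would replace NEAT-Python's DefaultReproduction, the configuration file name 'config-walker2d' is a placeholder, and the pre-0.26 Gym step API is assumed.

    import gym
    import neat

    # Placeholder NEAT-Python config file; among other things it must
    # declare num_inputs=17 and num_outputs=6 to match Walker2d-v2's
    # observation and action spaces.
    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         'config-walker2d')

    def eval_genomes(genomes, config):
        # Fitness of each genome = cumulative environment reward over one
        # episode, so the reward design directly shapes selection pressure.
        env = gym.make('Walker2d-v2')
        for genome_id, genome in genomes:
            net = neat.nn.FeedForwardNetwork.create(genome, config)
            obs = env.reset()
            fitness, done = 0.0, False
            while not done:
                # Network outputs act as joint torques; depending on the
                # activation function chosen, they may need scaling or
                # clipping to the environment's action range.
                action = net.activate(obs)
                obs, reward, done, _ = env.step(action)
                fitness += reward
            genome.fitness = fitness
        env.close()

    pop = neat.Population(config)
    pop.add_reporter(neat.StdOutReporter(True))   # per-generation statistics
    winner = pop.run(eval_genomes, n=50)          # evolve for 50 generations

Because fitness here is simply the accumulated reward, any change to the reward framework directly reshapes the selection pressure on the population, which is exactly the interdependence between reward design and the Evolutionary Algorithm that the abstract highlights.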



        Information & Contributors

        Information

        Published In

        cover image ACM Conferences
        GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion
        July 2020
        1982 pages
        ISBN:9781450371278
        DOI:10.1145/3377929

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
