We present a scalable two-level architecture for Hexapod locomotion through complex terrain without the use of exteroceptive sensors. Our approach assumes that the target complex terrain can be modeled by N discrete terrain distributions which capture individual difficulties of the target terrain. Expert policies (physical locomotion controllers) modeled by Artificial Neural Networks are trained independently in these individual scenarios using Deep Reinforcement Learning. These policies are then autonomously multiplexed during inference using a Recurrent Neural Network terrain classifier conditioned on the state history, giving an adaptive gait appropriate for the current terrain. We perform several tests to assess policy robustness by changing various parameters, such as contact, friction and actuator properties. We also show experiments of goal-based positional control of such a system and a way of selecting several gait criteria during deployment, giving us a complete solution for blind Hexapod locomotion in a practical setting. The Hexapod platform and all our experiments are modeled in the MuJoCo [1] physics simulator. Demonstrations are available in the supplementary video.
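The two-level structure described above, in which a recurrent classifier selects among N expert policies, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the class names, the dimensions, the single-layer stand-in networks with random (untrained) weights, and the hard arg-max selection are all hypothetical. In the paper the experts are trained with Deep Reinforcement Learning and the classifier is a trained Recurrent Neural Network.

```python
import math
import random

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, ACT_DIM, HIDDEN, N_TERRAINS = 6, 3, 8, 3


def rand_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class ExpertPolicy:
    """Stand-in for one expert: a single tanh layer mapping observation to action."""

    def __init__(self, rng):
        self.w = rand_matrix(ACT_DIM, OBS_DIM, rng)

    def act(self, obs):
        return [math.tanh(a) for a in matvec(self.w, obs)]


class RecurrentTerrainClassifier:
    """Stand-in Elman RNN: the hidden state summarizes the state history."""

    def __init__(self, rng):
        self.w_in = rand_matrix(HIDDEN, OBS_DIM, rng)
        self.w_rec = rand_matrix(HIDDEN, HIDDEN, rng)
        self.w_out = rand_matrix(N_TERRAINS, HIDDEN, rng)
        self.h = [0.0] * HIDDEN

    def step(self, obs):
        # Update the hidden state from the new observation and the previous state.
        pre = [a + b for a, b in zip(matvec(self.w_in, obs), matvec(self.w_rec, self.h))]
        self.h = [math.tanh(p) for p in pre]
        logits = matvec(self.w_out, self.h)
        # Hard selection: index of the most likely terrain class.
        return max(range(N_TERRAINS), key=lambda i: logits[i])


class MultiplexedController:
    """Routes each observation to the expert chosen by the classifier."""

    def __init__(self, experts, classifier):
        self.experts = experts
        self.classifier = classifier

    def act(self, obs):
        terrain = self.classifier.step(obs)
        return terrain, self.experts[terrain].act(obs)


def demo(steps=20, seed=0):
    """Run the controller on random observations (no simulator involved)."""
    rng = random.Random(seed)
    controller = MultiplexedController(
        [ExpertPolicy(rng) for _ in range(N_TERRAINS)],
        RecurrentTerrainClassifier(rng),
    )
    trace = []
    for _ in range(steps):
        obs = [rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]  # fake proprioceptive state
        trace.append(controller.act(obs))
    return trace
```

Calling `demo()` returns a list of `(terrain_index, action)` pairs, one per step. Because the classifier keeps a hidden state across calls, the selected expert depends on the whole observation history and not only on the latest observation, which is the property the abstract relies on for blind terrain identification.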
References
Vespignani, M., Friesen, J.M., SunSpiral, V., Bruce, J.: Design of superball v2, a compliant tensegrity robot for absorbing large impacts. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2865-2871 (2018). DOI 10.1109/IROS.2018.8594374
Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. CoRR abs/1609.05521 (2016). URL http://arxiv.org/abs/1609.05521
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017). URL http://arxiv.org/abs/1707.06347
Wang, T., Liao, R., Ba, J., Fidler, S.: Nervenet: Learning structured policy with graph neural networks. In: International Conference on Learning Representations (2018). URL https://openreview.net/forum?id=S1sqHMZCb
Peng, X.B., Berseth, G., van de Panne, M.: Terrain-adaptive locomotion skills using deep reinforcement learning. ACM Transactions on Graphics (Proc. SIGGRAPH 2016) 35(4) (2016)
Bjelonic, M., Kottege, N., Homberger, T., Borges, P., Beckerle, P., Chli, M.: Weaver: Hexapod robot for autonomous navigation on unstructured terrain. Journal of Field Robotics. 35, 1063–1079 (2018). https://doi.org/10.1002/rob.21795
Yu, W., Turk, G., Liu, C.K.: Learning symmetry and low-energy locomotion. CoRR abs/1801.08093 (2018). URL http://arxiv.org/abs/1801.08093
Boston dynamics, spot. https://www.bostondynamics.com/spot. Accessed: 16-10-2019
Ijspeert, A.J.: Central pattern generators for locomotion control in animals and robots: A review. Neural networks: the official journal of the International Neural Network Society. 21(4), 642–653 (2008)
Trossen robotics. https://www.trossenrobotics.com/. Accessed: 22-05-2010
Isvara, Y., Rachmatullah, S., Mutijarsa, K., Prabakti, D.E., Pragitatama, W.: Terrain adaptation gait algorithm in a hexapod walking robot. In: 2014 13th International Conference on Control Automation Robotics Vision (ICARCV), pp. 1735-1739 (2014). DOI 10.1109/ICARCV.2014.7064578
OpenAI, Andrychowicz, M., Baker, B., Chociej, M., Józefowicz, R., McGrew, B., Pachocki, J.W., Petron, A., Plappert, M., Powell, G., Ray, A., Schneider, J., Sidor, S., Tobin, J., Welinder, P., Weng, L., Zaremba, W.: Learning dexterous in-hand manipulation. CoRR abs/1808.00177 (2018). URL http://arxiv.org/abs/1808.00177
Graves, A., Liwicki, M., Fernández, S., Bertolami, R., Bunke, H., Schmidhuber, J.: A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 31(5), 855–868 (2009). https://doi.org/10.1109/TPAMI.2008.137
Kruijff, G., Kruijff-Korbayová, I., Keshavdas, S., Larochelle, B., Janíček, M., Colas, F., Liu, M., Pomerleau, F., Siegwart, R., Neerincx, M., Looije, R., Smets, N., Mioch, T., van Diggelen, J., Pirri, F., Gianni, M., Ferri, F., Menna, M., Worst, R., Linder, T., Tretyakov, V., Surmann, H., Svoboda, T., Reinštein, M., Zimmermann, K., Petříček, T., Hlaváč, V.: Designing, developing, and deploying systems to support human-robot teams in disaster response. Advanced Robotics. 28(23), 1547–1570 (2014). https://doi.org/10.1080/01691864.2014.985335
Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: S.A. Solla, T.K. Leen, K. Muller (eds.) Advances in Neural Information Processing Systems 12, pp. 1057-1063. MIT Press (2000). URL http://papers.nips.cc/paper/1713-policy-gradient-methods-for-reinforcement-learning-with-function-approximation.pdf
Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. CoRR abs/1703.06907 (2017). URL http://arxiv.org/abs/1703.06907
Perlin, K.: Improving noise. ACM Trans. Graph. 21(3), 681–682 (2002). https://doi.org/10.1145/566654.566636
Wierstra, D., Schaul, T., Glasmachers, T., Sun, Y., Peters, J., Schmidhuber, J.: Natural evolution strategies. Journal of Machine Learning Research 15, 949-980 (2014). URL http://jmlr.org/papers/v15/wierstra14a.html
Hutter, M., Gehring, C., Lauber, A., Gunther, F., Bellicoso, C.D., Tsounis, V., Fankhauser, P., Diethelm, R., Bachmann, S., Bloesch, M., Kolvenbach, H., Bjelonic, M., Isler, L., Meyer, K.: Anymal - toward legged robots for harsh environments. Advanced Robotics. 31(17), 918–931 (2017). https://doi.org/10.1080/01691864.2017.1378591
Graves, A., Mohamed, A., Hinton, G.E.: Speech recognition with deep recurrent neural networks. CoRR abs/1303.5778 (2013). URL http://arxiv.org/abs/1303.5778
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735
Manoonpong, P., Parlitz, U., Wörgötter, F.: Neural control and adaptive neural forward models for insect-like, energy-efficient, and adaptable locomotion of walking machines. Frontiers in neural circuits. 7, 12 (2013). https://doi.org/10.3389/fncir.2013.00012
Ross, S., Gordon, G.J., Bagnell, J.A.: No-regret reductions for imitation learning and structured prediction. CoRR abs/1011.0686 (2010). URL http://arxiv.org/abs/1011.0686
Pecka, M., Zimmermann, K., Reinstein, M., Svoboda, T.: Controlling robot morphology from incomplete measurements. CoRR abs/1612.02739 (2016). URL http://arxiv.org/abs/1612.02739
Todorov, E., Erez, T., Tassa, Y.: Mujoco: A physics engine for model-based control. 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems pp. 5026-5033 (2012)
Čížek, P., Faigl, J.: On locomotion control using position feedback only in traversing rough terrains with hexapod crawling robot. IOP Conference Series: Materials Science and Engineering. 428, 012065 (2018). https://doi.org/10.1088/1757-899X/428/1/012065
Sanchez-Gonzalez, A., Heess, N., Springenberg, J.T., Merel, J., Riedmiller, M.A., Hadsell, R., Battaglia, P.: Graph networks as learnable physics engines for inference and control. CoRR abs/1806.01242 (2018). URL http://arxiv.org/abs/1806.01242
Saranli, U.: Rhex: A simple and highly mobile hexapod robot. The International Journal of Robotics Research. 20, 616–631 (2001). https://doi.org/10.1177/02783640122067570
Xie, Z., Berseth, G., Clary, P., Hurst, J.W., van de Panne, M.: Feedback control for cassie with deep reinforcement learning. CoRR abs/1803.05580 (2018). URL http://arxiv.org/abs/1803.05580
Bitter lesson, rich sutton. http://www.incompleteideas.net/IncIdeas/BitterLesson.html. Accessed: 2019-04-18
The research leading to these results has received funding from the Czech Science Foundation under Project 17-08842S.
Electronic supplementary material
(MP4 22534 kb)
Cite this article
Azayev, T., Zimmerman, K. Blind Hexapod Locomotion in Complex Terrain with Gait Adaptation Using Deep Reinforcement Learning and Classification. J Intell Robot Syst 99, 659–671 (2020). https://doi.org/10.1007/s10846-020-01162-8
DOI: https://doi.org/10.1007/s10846-020-01162-8