Abstract
Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent’s experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
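The two selection stages described above can be illustrated with a small sketch. This is not the authors' implementation: the Gaussian (RBF) kernel, the `length_scale` parameter, and every function name below are assumptions made for illustration, and the second stage uses greedy determinant maximisation as a simple deterministic stand-in for true k-DPP sampling.

```python
import numpy as np


def rbf_kernel(goals, length_scale=1.0):
    """Similarity kernel over goal states (assumed Gaussian; hypothetical choice)."""
    g = np.asarray(goals, dtype=float)
    sq_dists = np.sum((g[:, None, :] - g[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale ** 2))


def diversity(goals, length_scale=1.0):
    """DPP-style diversity score: determinant of the kernel over the goal set.

    Near-identical goals give a determinant close to 0; well-spread goals
    give a value closer to 1.
    """
    det = float(np.linalg.det(rbf_kernel(goals, length_scale)))
    return max(det, 0.0)  # clamp tiny negative values from round-off


def trajectory_probabilities(trajectories, length_scale=1.0):
    """Sampling probability of each trajectory, proportional to its diversity."""
    scores = np.array([diversity(t, length_scale) for t in trajectories])
    total = scores.sum()
    if total <= 0.0:
        # No trajectory shows any diversity: fall back to uniform sampling.
        return np.full(len(scores), 1.0 / len(scores))
    return scores / total


def greedy_diverse_subset(goals, k, length_scale=1.0):
    """Pick k goal indices by greedily maximising the kernel sub-determinant.

    A deterministic approximation used here in place of k-DPP sampling.
    """
    kernel = rbf_kernel(goals, length_scale)
    chosen, remaining = [], list(range(len(kernel)))
    for _ in range(min(k, len(kernel))):
        best = max(
            remaining,
            key=lambda i: np.linalg.det(kernel[np.ix_(chosen + [i], chosen + [i])]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Under these assumptions, a trajectory whose achieved goals never change receives a diversity of zero and is effectively never sampled, while the greedy step skips goals that nearly duplicate ones already chosen.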
Acknowledgements
This work was supported by JST, Moonshot R&D Grant Number JPMJMS2012.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Dai, T., Liu, H., Arulkumaran, K., Ren, G., Bharath, A.A. (2021). Diversity-Based Trajectory and Goal Selection with Hindsight Experience Replay. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds) PRICAI 2021: Trends in Artificial Intelligence. PRICAI 2021. Lecture Notes in Computer Science, vol 13033. Springer, Cham. https://doi.org/10.1007/978-3-030-89370-5_3
DOI: https://doi.org/10.1007/978-3-030-89370-5_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89369-9
Online ISBN: 978-3-030-89370-5
eBook Packages: Computer Science, Computer Science (R0)