Diversity-Based Trajectory and Goal Selection with Hindsight Experience Replay

  • Conference paper
  • In: PRICAI 2021: Trends in Artificial Intelligence (PRICAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13033)

Abstract

Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent’s experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
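
As a rough illustration of the approach summarised above, the sketch below (Python with NumPy, written for this page rather than taken from the authors' implementation) scores each trajectory by the determinant of an RBF kernel over its goal states, samples trajectories in proportion to that score, and then greedily picks a diverse subset of transitions. The kernel choice, all function names, and the greedy stand-in for exact k-DPP sampling are illustrative assumptions, not details from the paper.

```python
import numpy as np

def rbf_kernel(goals, sigma=1.0):
    # Squared-exponential similarity kernel over one trajectory's goal states.
    sq_dists = np.sum((goals[:, None, :] - goals[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def trajectory_diversity(goals, eps=1e-6):
    # DPP-style diversity score: determinant of the goal-state kernel.
    # More spread-out goal states give a larger determinant.
    L = rbf_kernel(goals)
    return np.linalg.det(L + eps * np.eye(len(goals)))

def sample_trajectories(episode_goals, n, rng):
    # Sample n episode indices with probability proportional to diversity score.
    scores = np.array([trajectory_diversity(g) for g in episode_goals])
    probs = scores / scores.sum()
    return rng.choice(len(episode_goals), size=n, p=probs)

def select_diverse_transitions(goals, k):
    # Greedy determinant maximisation as a simple stand-in for exact k-DPP
    # sampling, which would instead use an eigendecomposition of the kernel.
    L = rbf_kernel(goals)
    k = min(k, len(goals))
    selected = [int(np.argmax(np.diag(L)))]
    while len(selected) < k:
        gains = [np.linalg.det(L[np.ix_(selected + [i], selected + [i])])
                 if i not in selected else -np.inf
                 for i in range(len(goals))]
        selected.append(int(np.argmax(gains)))
    return selected

# Example: 10 episodes, each with 50 three-dimensional goal states.
rng = np.random.default_rng(0)
episode_goals = [rng.normal(size=(50, 3)) for _ in range(10)]
episode_ids = sample_trajectories(episode_goals, n=4, rng=rng)
transition_ids = select_diverse_transitions(episode_goals[episode_ids[0]], k=8)
```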

Notes

  1. https://github.com/openai/baselines.

Acknowledgements

This work was supported by JST, Moonshot R&D Grant Number JPMJMS2012.

Author information

Correspondence to Tianhong Dai.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Dai, T., Liu, H., Arulkumaran, K., Ren, G., Bharath, A.A. (2021). Diversity-Based Trajectory and Goal Selection with Hindsight Experience Replay. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds) PRICAI 2021: Trends in Artificial Intelligence. PRICAI 2021. Lecture Notes in Computer Science, vol. 13033. Springer, Cham. https://doi.org/10.1007/978-3-030-89370-5_3

  • DOI: https://doi.org/10.1007/978-3-030-89370-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89369-9

  • Online ISBN: 978-3-030-89370-5

  • eBook Packages: Computer Science (R0)
