
Deep Reinforcement Learning for Peg-in-hole Assembly Task Via Information Utilization Method

  • Regular paper
  • Published:
Journal of Intelligent & Robotic Systems

Abstract

Deep reinforcement learning has been widely studied across robotics, but its practical application is severely limited by slow convergence. Demonstration information can effectively speed up convergence, yet over-reliance on demonstrations degrades training in the real environment and ultimately worsens convergence. Historical information should also be considered, since it affects both how efficiently data are used and how well the algorithm converges, but it has received little attention so far. This paper proposes an improved reinforcement learning algorithm that builds on the Proximal Policy Optimization (PPO) algorithm by introducing a demonstration-information utilization mechanism and an LSTM network. Demonstrations provide the robot with a prior knowledge base, and the utilization mechanism balances demonstration data against interaction data so that data efficiency improves. In addition, we restructure the deep reinforcement learning network to incorporate historical information. Experimental results show that the method is feasible and, compared with existing solutions, significantly improves the convergence of autonomous robot learning.
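The preview gives no implementation details, but the mechanisms named above are concrete enough to sketch. The Python fragment below is a minimal illustration under stated assumptions, not the authors' code: it assumes PyTorch, and the names `RecurrentPolicy`, `demo_ratio`, and `mix_batch` are hypothetical. It shows an LSTM policy head that carries historical information across time steps, a decaying sampling ratio that balances demonstration data against interaction data, and the standard PPO clipped objective the method builds on.

```python
# Hypothetical sketch (not the authors' implementation), assuming PyTorch.
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """PPO actor whose LSTM summarizes the observation history."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, act_dim)           # action mean
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, time, obs_dim); the LSTM state injects history.
        out, state = self.lstm(obs_seq, state)
        dist = torch.distributions.Normal(self.mu(out[:, -1]),
                                          self.log_std.exp())
        return dist, state

def demo_ratio(step, total_steps, start=0.5, end=0.05):
    """Linearly decay the demonstration share of each minibatch, so
    training leans on demonstrations early and on interaction data later."""
    frac = min(step / total_steps, 1.0)
    return start + (end - start) * frac

def mix_batch(demo_buf, online_buf, batch_size, ratio):
    """Sample a minibatch with `ratio` drawn from the demonstration buffer."""
    n_demo = int(batch_size * ratio)
    demo_idx = torch.randint(len(demo_buf), (n_demo,)).tolist()
    online_idx = torch.randint(len(online_buf), (batch_size - n_demo,)).tolist()
    return ([demo_buf[i] for i in demo_idx]
            + [online_buf[i] for i in online_idx])

def ppo_clip_loss(new_logp, old_logp, advantage, clip=0.2):
    """Standard PPO clipped surrogate objective."""
    ratio = (new_logp - old_logp).exp()
    clipped = torch.clamp(ratio, 1.0 - clip, 1.0 + clip)
    return -torch.min(ratio * advantage, clipped * advantage).mean()
```

Under a scheme like this the agent never trains on demonstrations alone: every minibatch mixes both sources, and the decaying ratio is one simple way to avoid the over-reliance on demonstrations that the abstract identifies as harmful in the real environment.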


Data availability

The authors are temporarily withholding the data; it is not available at this time.

Code availability

The authors are temporarily withholding the code; it is not available at this time.


Funding

This work was supported by the National Natural Science Foundation of China under Grants 61973065 and 52075531, the Fundamental Research Funds for the Central Universities of China under Grant N2104008, the Central Government Guides the Local Science and Technology Development Special Fund under Grant 2021JH6/10500129, and the Innovative Talents Support Program of Liaoning Provincial Universities under Grant LR2020047.

Author information


Contributions

Fei Wang and Ben Cui conceived the project. Ben Cui and Yue Liu conducted experiments in the simulation environment and collected the test data. Fei Wang and Baiming Ren completed the real-world part of the experiment. Fei Wang and Ben Cui analyzed the data and wrote the manuscript. Yue Liu and Baiming Ren provided valuable comments. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Fei Wang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Ethics approval

This article does not contain any studies with human participants performed by any of the authors.

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Consent for publication

The authors declare that they consent to publication.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, F., Cui, B., Liu, Y. et al. Deep Reinforcement Learning for Peg-in-hole Assembly Task Via Information Utilization Method. J Intell Robot Syst 106, 16 (2022). https://doi.org/10.1007/s10846-022-01713-1

