
Deep Reinforcement Learning for Peg-in-hole Assembly Task Via Information Utilization Method

  • Regular paper
  • Published:
Journal of Intelligent & Robotic Systems

Abstract

Deep reinforcement learning has been widely studied across robotics, but its practical application is severely limited by slow convergence. Demonstration information can effectively speed up convergence, yet over-reliance on demonstrations degrades training in the real environment and ultimately worsens convergence. Historical information should also be considered, since it affects both how efficiently data are used and how well the algorithm converges, but it has received little attention so far. This paper proposes an improved reinforcement learning algorithm that builds on the Proximal Policy Optimization (PPO) algorithm by introducing a demonstration-information utilization mechanism and an LSTM network. Demonstrations provide the robot with a prior knowledge base, and the utilization mechanism balances demonstration data against interaction data so that data efficiency improves. In addition, we restructure the deep reinforcement learning network to incorporate historical information. Experimental results show that the method is feasible and, compared with existing solutions, significantly improves the convergence of autonomous robot learning.
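The preview gives no implementation details, but the mechanisms named above are concrete enough to sketch. The Python fragment below is a minimal illustration under stated assumptions, not the authors' code: it assumes PyTorch, and the names `RecurrentPolicy`, `demo_ratio`, and `mix_batch` are hypothetical. It shows an LSTM policy head that carries historical information across time steps, a decaying sampling ratio that balances demonstration data against interaction data, and the standard PPO clipped objective the method builds on.

```python
# Hypothetical sketch (not the authors' implementation), assuming PyTorch.
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """PPO actor whose LSTM summarizes the observation history."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, act_dim)           # action mean
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, time, obs_dim); the LSTM state injects history.
        out, state = self.lstm(obs_seq, state)
        dist = torch.distributions.Normal(self.mu(out[:, -1]),
                                          self.log_std.exp())
        return dist, state

def demo_ratio(step, total_steps, start=0.5, end=0.05):
    """Linearly decay the demonstration share of each minibatch, so
    training leans on demonstrations early and on interaction data later."""
    frac = min(step / total_steps, 1.0)
    return start + (end - start) * frac

def mix_batch(demo_buf, online_buf, batch_size, ratio):
    """Sample a minibatch with `ratio` drawn from the demonstration buffer."""
    n_demo = int(batch_size * ratio)
    demo_idx = torch.randint(len(demo_buf), (n_demo,)).tolist()
    online_idx = torch.randint(len(online_buf), (batch_size - n_demo,)).tolist()
    return ([demo_buf[i] for i in demo_idx]
            + [online_buf[i] for i in online_idx])

def ppo_clip_loss(new_logp, old_logp, advantage, clip=0.2):
    """Standard PPO clipped surrogate objective."""
    ratio = (new_logp - old_logp).exp()
    clipped = torch.clamp(ratio, 1.0 - clip, 1.0 + clip)
    return -torch.min(ratio * advantage, clipped * advantage).mean()
```

Under a scheme like this the agent never trains on demonstrations alone: every minibatch mixes both sources, and the decaying ratio is one simple way to avoid the over-reliance on demonstrations that the abstract identifies as harmful in the real environment.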


Data availability

The authors are temporarily withholding the data; it is not available at this time.

Code availability

The authors are temporarily withholding the code; it is not available at this time.


Funding

This work was supported by the National Natural Science Foundation of China under Grants 61973065 and 52075531, the Fundamental Research Funds for the Central Universities of China under Grant N2104008, the Central Government Guides the Local Science and Technology Development Special Fund under Grant 2021JH6/10500129, and the Innovative Talents Support Program of Liaoning Provincial Universities under Grant LR2020047.

Author information


Contributions

Fei Wang and Ben Cui conceived the project. Ben Cui and Yue Liu conducted experiments in the simulation environment and collected the test data. Fei Wang and Baiming Ren completed the real-world part of the experiment. Fei Wang and Ben Cui analyzed the data and wrote the manuscript. Yue Liu and Baiming Ren provided valuable comments. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Fei Wang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Ethics approval

This article does not contain any studies with human participants performed by any of the authors.

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Consent for publication

The authors declare that they consent to publication.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, F., Cui, B., Liu, Y. et al. Deep Reinforcement Learning for Peg-in-hole Assembly Task Via Information Utilization Method. J Intell Robot Syst 106, 16 (2022). https://doi.org/10.1007/s10846-022-01713-1

