Abstract
In this work we propose an architecture, which we name the Generative Inception Neural Network (GI-NNet), capable of intelligently predicting antipodal robotic grasps on both seen and unseen objects. Trained on the Cornell Grasping Dataset (CGD), it attains a grasp pose accuracy of 98.87% in detecting both regular and irregular shaped objects from RGB-Depth images, while requiring only one-third of the trainable network parameters of existing approaches. However, to reach this level of performance the model needs 90% of the available labelled data of CGD for training, leaving only 10% for testing, which makes it vulnerable to poor generalization. Moreover, obtaining a sufficiently large, high-quality labelled dataset for robot grasping is extremely difficult. To address these issues, we subsequently propose a second architecture in which the GI-NNet model is attached as the decoder of a Vector Quantized Variational Auto-Encoder (VQ-VAE), which works more efficiently when trained on both labelled and unlabelled data. This model, which we name the Representation based GI-NNet (RGI-NNet), has been trained on various splits of the CGD, ranging from 10% to 90% labelled data combined with the latent embedding of the VQ-VAE, to test the learning ability of the architecture. Notably, the architecture produces its best results when trained with only 50% of the labelled data of CGD together with the latent embedding, which we believe is a remarkable accomplishment; the reasoning behind this, along with other relevant technical details, is elaborated in this paper. The grasp pose accuracy of RGI-NNet varies between 92.1348% and 97.7528%, which is considerably better than several existing models trained only on labelled data. To verify the performance of both proposed models, GI-NNet and RGI-NNet, we have performed rigorous experiments on the Anukul (Baxter) hardware cobot.
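To make the described architecture concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' implementation) of the RGI-NNet idea: an encoder producing a latent embedding, a VQ-VAE-style vector quantizer, and a small decoder that maps the quantized latent to pixel-wise grasp maps. The codebook size, layer widths, 4-channel RGB-D input, and the cos/sin angle encoding of the grasp orientation are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a VQ-VAE latent feeding a grasp-map decoder.
# All sizes are illustrative; commitment/codebook losses are omitted for brevity.
import torch
import torch.nn as nn


class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through estimator."""
    def __init__(self, num_codes=128, code_dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)

    def forward(self, z):                                     # z: (B, C, H, W)
        b, c, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, c)           # (B*H*W, C)
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
        zq = self.codebook(idx).view(b, h, w, c).permute(0, 3, 1, 2)
        return z + (zq - z).detach()                          # straight-through gradient


class GraspDecoder(nn.Module):
    """Small decoder mapping the quantized latent to grasp maps."""
    def __init__(self, code_dim=64):
        super().__init__()
        self.up = nn.Sequential(
            nn.ConvTranspose2d(code_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.quality = nn.Conv2d(16, 1, 1)                    # grasp quality map
        self.cos = nn.Conv2d(16, 1, 1)                        # cos(2*theta) map
        self.sin = nn.Conv2d(16, 1, 1)                        # sin(2*theta) map
        self.width = nn.Conv2d(16, 1, 1)                      # gripper width map

    def forward(self, zq):
        f = self.up(zq)
        return self.quality(f), self.cos(f), self.sin(f), self.width(f)


# Toy encoder for a 4-channel RGB-D image (assumed input format).
encoder = nn.Sequential(
    nn.Conv2d(4, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
)
vq, dec = VectorQuantizer(), GraspDecoder()
rgbd = torch.randn(1, 4, 224, 224)                            # dummy RGB-D input
quality, cos_t, sin_t, width = dec(vq(encoder(rgbd)))
print(quality.shape)                                          # torch.Size([1, 1, 224, 224])
```

In such a setup the encoder and quantizer can be pre-trained on unlabelled images, so that only the decoder needs the (scarce) grasp-labelled data, which mirrors the motivation given in the abstract for training with reduced label splits.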
Acknowledgements
The present research is partially funded by the I-Hub foundation for Cobotics (Technology Innovation Hub of IIT-Delhi set up by the Department of Science and Technology, Govt. of India).
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Shukla, P., Pramanik, N., Mehta, D. et al. Generative model based robotic grasp pose prediction with limited dataset. Appl Intell 52, 9952–9966 (2022). https://doi.org/10.1007/s10489-021-03011-z