Abstract
For grasping tasks in physical environments, the manipulator must know the objects’ spatial positions in real time using as few sensors as possible. This work proposes an effective framework that organizes the objects’ spatial positions in the manipulator’s 3D workspace quickly and robustly using a single RGB-D camera. It consists of two main steps: (1) a 3D reconstruction strategy for the object contours extracted from the environment; (2) a distance-restricted outlier elimination strategy that reduces the reconstruction errors caused by sensor noise. The first step ensures fast object extraction and 3D reconstruction from the scene image, and the second step yields more accurate reconstructions by eliminating outlier points from the initial result of the first step. We validated the proposed method on a physical system consisting of a Kinect 2.0 RGB-D camera and a Mico2 robot. Experiments show that the proposed method runs in quasi real time on a common PC and outperforms traditional 3D reconstruction methods.
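The two steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the reconstruction formula or the exact distance criterion, so the sketch assumes a standard pinhole back-projection for step (1) and a simple "distance from the per-axis median" rule for step (2). The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the threshold `max_dist_m` are hypothetical values chosen for illustration, not calibrated Kinect 2.0 parameters.

```python
import math
from statistics import median


def back_project(pixels, depths_m, fx, fy, cx, cy):
    """Back-project contour pixels (u, v) with depth z (metres) into
    camera-frame 3D points using the pinhole model (an assumption here)."""
    points = []
    for (u, v), z in zip(pixels, depths_m):
        if z <= 0:  # assumed convention: non-positive depth means "no reading"
            continue
        points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def remove_outliers(points, max_dist_m):
    """Keep points whose Euclidean distance to the per-axis median of all
    points is at most max_dist_m (a stand-in for the paper's criterion)."""
    if not points:
        return []
    center = tuple(median(p[i] for p in points) for i in range(3))
    return [p for p in points if math.dist(p, center) <= max_dist_m]


# Hypothetical contour: four consistent readings and one depth spike.
pixels = [(320, 240), (330, 240), (320, 250), (330, 250), (325, 245)]
depths = [1.0, 1.0, 1.0, 1.0, 3.0]

cloud = back_project(pixels, depths, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
kept = remove_outliers(cloud, max_dist_m=0.1)
print(len(cloud), len(kept))
```

The resulting points are expressed in the camera frame; mapping them into the manipulator workspace would additionally require a camera-to-robot transform, which this sketch does not cover.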
Acknowledgments
This work is supported by the National Key Research & Development Program of China (No. 2018AAA0102902), the National Natural Science Foundation of China (NSFC) (No. 61873269), the Beijing Natural Science Foundation (No. L192005), the CAAI-Huawei MindSpore Open Fund (CAAIXSJLJJ-20202-027A), the Guangxi Key Research and Development Program (AB18221011, AB21075004, AD18281002, AD19110137), the Natural Science Foundation of Guangxi of China (Nos. 2020GXNSFAA297061, 2019GXNSFDA185006, 2019GXNSFDA185007), the Guangxi Key Laboratory of Intelligent Processing of Computer Images and Graphics (No. GIIP201702), and the Guangxi Key Laboratory of Trusted Software (Nos. kx201621, kx201715).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Sun, Y., Yang, M., Li, J., Qiang, B., Chen, J., Jia, Q. (2021). Fast Organization of Objects’ Spatial Positions in Manipulator Space from Single RGB-D Camera. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13110. Springer, Cham. https://doi.org/10.1007/978-3-030-92238-2_15
DOI: https://doi.org/10.1007/978-3-030-92238-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92237-5
Online ISBN: 978-3-030-92238-2
eBook Packages: Computer Science, Computer Science (R0)