Abstract
The 6D (six-degrees-of-freedom) pose estimation (or pose measurement) of machined, reflective, texture-less objects, which are common in industry, is an important but challenging task that has attracted increasing attention in both academia and industry. However, suitable public datasets of such objects are difficult to obtain, which hinders relevant research. We therefore propose the Reflective Texture-Less (RT-Less) object dataset, a new public dataset of reflective texture-less metal parts for pose estimation research. The dataset contains 38 machined texture-less reflective metal parts, which exhibit symmetry as well as similarity in shape and size. In total, the dataset comprises 289 K RGB images and the same number of masks: the training set contains 25,080 real images and 250,800 synthetic images, and the test set contains 13,312 real images captured in 32 different scenes. The dataset also provides accurate ground-truth poses, bounding-box annotations, and masks for these images, which makes RT-Less suitable for object detection and instance segmentation as well. To improve the accuracy of the ground truth, an iterative pose optimization method using only RGB images is proposed. Baselines of state-of-the-art pose estimation methods are provided for further comparative studies. The dataset and baseline results are available at: http://www.zju-rtl.cn/RT-Less/.
Data availability
The data and necessary code are available at http://www.zju-rtl.cn/RT-Less/.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grants 52275514 and 52275547, and in part by the Natural Science Foundation of Zhejiang Province under Grant LY21E050021.
Funding
National Natural Science Foundation of China: Grant 52275514 (Zaixing He) and Grant 52275547 (Xinyue Zhao). Natural Science Foundation of Zhejiang Province: Grant LY21E050021 (Xinyue Zhao).
Ethics declarations
Conflicts of interest
Xinyue Zhao, Quanzhi Li, Yue Chao, Quanyou Wang, Zaixing He, and Dong Liang declare that they have no conflicts of interest or financial conflicts to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g., a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhao, X., Li, Q., Chao, Y. et al.: RT-Less: a multi-scene RGB dataset for 6D pose estimation of reflective texture-less objects. Vis. Comput. 40, 5187–5200 (2024). https://doi.org/10.1007/s00371-023-03097-1