
RT-less: a multi-scene RGB dataset for 6D pose estimation of reflective texture-less objects

Original article · Published in The Visual Computer

Abstract

The 6D (six degrees of freedom) pose estimation (or pose measurement) of machined reflective texture-less objects, which are common in industry, is a significant but challenging task that has attracted increasing attention in academia and industry. However, suitable public datasets of such objects are difficult to obtain, which hinders related research. We therefore propose the Reflective Texture-Less (RT-Less) object dataset, a new public dataset of reflective texture-less metal parts for pose estimation research. The dataset contains 38 machined texture-less reflective metal parts in total; different parts exhibit symmetry and similarity in shape and size. The dataset comprises 289 K RGB images and the same number of masks: 25,080 real images and 250,800 synthetic images in the training set, and 13,312 real images captured in 32 different scenes in the test set. The dataset also provides accurate ground-truth poses, bounding-box annotations, and masks for these images, which makes RT-Less suitable for object detection and instance segmentation as well. To improve the accuracy of the ground truth, an iterative pose optimization method using only RGB images is proposed. Baselines of state-of-the-art pose estimation methods are provided for further comparative studies. The dataset and baseline results are available at: http://www.zju-rtl.cn/RT-Less/.
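A 6D pose of the kind annotated in such datasets is conventionally a rotation matrix R and a translation vector t mapping object-frame points into the camera frame. As a minimal sketch of how a ground-truth pose is typically used to project a part's model points into an image (assuming the standard pinhole model; the intrinsics and points below are made-up toy values, not taken from the dataset):

```python
import numpy as np

def project_points(model_points, R, t, K):
    """Project 3D object-frame points into the image using a 6D pose (R, t).

    model_points: (N, 3) points in the object frame (e.g. sampled from a CAD model)
    R: (3, 3) rotation matrix, t: (3,) translation vector
    K: (3, 3) camera intrinsic matrix
    Returns: (N, 2) pixel coordinates.
    """
    cam_points = model_points @ R.T + t   # object frame -> camera frame
    uv = cam_points @ K.T                 # pinhole projection
    return uv[:, :2] / uv[:, 2:3]         # perspective divide

# Toy example: identity rotation, object 500 mm in front of the camera.
K = np.array([[600.0,   0.0, 320.0],
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 500.0])
pts = np.array([[0.0, 0.0, 0.0]])         # the object origin
print(project_points(pts, R, t, K))       # -> [[320. 240.]] (the principal point)
```

Verifying that rendered or reprojected model points align with the observed part silhouette under the annotated pose is also the usual way to sanity-check ground-truth accuracy in RGB-only datasets like this one.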




Data availability

The data and necessary code are available at http://www.zju-rtl.cn/RT-Less/.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 52275514 and 52275547, and in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LY21E050021.

Funding

This work was funded by the National Natural Science Foundation of China (Grant 52275514, Zaixing He; Grant 52275547, Xinyue Zhao) and the Natural Science Foundation of Zhejiang Province (Grant LY21E050021, Xinyue Zhao).

Author information


Corresponding author

Correspondence to Zaixing He.

Ethics declarations

Conflicts of interest

Xinyue Zhao, Quanzhi Li, Yue Chao, Quanyou Wang, Zaixing He, and Dong Liang declare that they have no conflicts of interest or financial conflicts to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhao, X., Li, Q., Chao, Y. et al. RT-less: a multi-scene RGB dataset for 6D pose estimation of reflective texture-less objects. Vis Comput 40, 5187–5200 (2024). https://doi.org/10.1007/s00371-023-03097-1

