
LISO: Lidar-Only Self-supervised 3D Object Detection

Published: 26 October 2024

Abstract

3D object detection is one of the most important components in any self-driving stack, but current state-of-the-art (SOTA) lidar object detectors require costly and slow manual annotation of 3D bounding boxes to perform well. Recently, several methods have emerged that generate pseudo ground truth without human supervision; however, all of them have drawbacks. Some require sensor rigs with full camera coverage and accurate calibration, partly supplemented by an auxiliary optical flow engine. Others require expensive high-precision localization to find objects that disappeared over multiple drives.
We introduce a novel self-supervised method to train SOTA lidar object detection networks, requiring only unlabeled sequences of lidar point clouds. We call this trajectory-regularized self-training. Under the hood, it uses a SOTA self-supervised lidar scene flow network to generate, track, and iteratively refine pseudo ground truth. We demonstrate the effectiveness of our approach for multiple SOTA object detection networks across multiple real-world datasets. Code will be released (https://github.com/baurst/liso).
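The pseudo-ground-truth mining step behind such trajectory-regularized self-training can be illustrated with a toy sketch: points with large scene-flow magnitude are clustered per frame, centroids are associated across frames into tracks, and only tracks that persist long enough survive as pseudo labels. All function names, thresholds, the single-cluster-per-frame simplification, and the `(x, y, flow_magnitude)` point format below are illustrative assumptions, not the paper's actual implementation (which additionally retrains the detector on the mined labels in each round):

```python
# Toy sketch of trajectory-regularized pseudo-label mining (illustrative only;
# names and thresholds are assumptions, not the authors' API).

FLOW_THRESH = 0.5    # m/frame: points moving faster are candidate objects
MIN_TRACK_LEN = 3    # trajectory regularization: shorter tracks are noise
ASSOC_GATE_SQ = 4.0  # squared distance gate (2 m) for frame-to-frame matching

def mine_pseudo_labels(frames):
    """frames: list of frames; each frame is a list of (x, y, flow_magnitude).

    Returns tracks, each a list of (frame_idx, centroid) pairs, keeping only
    tracks that persist for at least MIN_TRACK_LEN consecutive frames.
    """
    tracks = []
    for t, points in enumerate(frames):
        # 1. Keep only moving points (scene flow above threshold).
        moving = [(x, y) for x, y, flow in points if flow > FLOW_THRESH]
        if not moving:
            continue
        # 2. Collapse moving points into one centroid (single-object toy case;
        #    a real pipeline would cluster, e.g. with DBSCAN).
        cx = sum(p[0] for p in moving) / len(moving)
        cy = sum(p[1] for p in moving) / len(moving)
        # 3. Greedy association to a track that ended in the previous frame.
        matched = None
        for tr in tracks:
            last_t, (lx, ly) = tr[-1]
            if last_t == t - 1 and (lx - cx) ** 2 + (ly - cy) ** 2 < ASSOC_GATE_SQ:
                matched = tr
                break
        if matched is not None:
            matched.append((t, (cx, cy)))
        else:
            tracks.append([(t, (cx, cy))])
    # 4. Trajectory regularization: discard short, inconsistent tracks.
    return [tr for tr in tracks if len(tr) >= MIN_TRACK_LEN]
```

In a full self-training loop, the surviving tracks would be converted to 3D boxes, used to train the detector, and the detector's own (tracked) outputs would seed the next refinement round.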




            Published In

Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXXXVI
September 2024, 554 pages
ISBN: 978-3-031-73015-3
DOI: 10.1007/978-3-031-73016-0
Editors: Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol

Publisher

Springer-Verlag, Berlin, Heidelberg


            Author Tags

            1. Self-Supervised
            2. LiDAR
            3. Object Detection
