FS-3DSSN: an efficient few-shot learning for single-stage 3D object detection on point clouds

Original article
Published in The Visual Computer

Abstract

Current 3D object detection methods achieve promising results on conventional tasks that detect frequently occurring objects such as cars, pedestrians, and cyclists. However, they require many annotated bounding boxes and class labels for training, which are expensive and difficult to obtain. Detecting infrequently occurring objects, such as police vehicles, is nevertheless also essential for successful autonomous driving. We therefore explore the potential of few-shot learning to handle the challenge of detecting infrequent categories. Current 3D object detectors do not have the architecture needed to support this type of learning. This paper therefore presents a new method, termed few-shot single-stage network for 3D object detection (FS-3DSSN), to predict infrequent object categories. FS-3DSSN uses a class-incremental few-shot learning approach to detect infrequent categories without compromising the detection accuracy of frequent categories. It consists of two modules: (i) a single-stage network architecture for 3D object detection (3DSSN) that uses deformable convolutions to detect small objects and (ii) a class-incremental meta-learning module to learn and predict infrequent class categories. 3DSSN obtains 84.53 \(\textrm{mAP}_{\textrm{3D}}\) on the KITTI car category and 73.4 NDS on the nuScenes dataset, outperforming the previous state of the art. FS-3DSSN also shows encouraging results on nuScenes for detecting infrequent categories while maintaining accuracy on frequent classes.
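The abstract names two technical ingredients: deformable convolutions inside the single-stage detector (3DSSN) and a class-incremental meta-learning module for infrequent categories. The implementation details are not reproduced on this page, so the two sketches below are only illustrative. The first is a minimal PyTorch sketch of a deformable convolution block applied to bird's-eye-view (BEV) feature maps; the channel sizes, block placement, and the use of torchvision's DeformConv2d are assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' code): a 3x3 deformable convolution block of the kind
# the abstract attributes to 3DSSN for detecting small objects in BEV feature maps.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableBlock(nn.Module):
    """A deformable conv block whose sampling offsets are predicted by a plain conv."""

    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        # 2 offset values (dx, dy) per kernel position and output location.
        self.offset_conv = nn.Conv2d(
            in_channels, 2 * kernel_size * kernel_size,
            kernel_size=kernel_size, padding=kernel_size // 2)
        self.deform_conv = DeformConv2d(
            in_channels, out_channels,
            kernel_size=kernel_size, padding=kernel_size // 2)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_conv(x)        # (N, 2*K*K, H, W)
        out = self.deform_conv(x, offsets)   # sample input features at offset positions
        return self.relu(self.bn(out))


# Example: a BEV pseudo-image, e.g. from a pillar/voxel encoder (sizes are assumptions).
bev_features = torch.randn(1, 64, 200, 176)   # (batch, channels, H, W)
block = DeformableBlock(64, 128)
print(block(bev_features).shape)              # torch.Size([1, 128, 200, 176])
```

The second is a minimal sketch of a prototype-style metric classifier, one common way to add novel classes from a few labelled examples without retraining the base detector. The paper's actual meta-learning module may differ; all names and shapes below are hypothetical.

```python
# Minimal sketch of prototype-based few-shot classification of box embeddings,
# illustrating class-incremental recognition of infrequent categories.
import torch
import torch.nn.functional as F


def class_prototypes(support_feats: torch.Tensor, support_labels: torch.Tensor) -> torch.Tensor:
    """Mean embedding per novel class, computed from a few labelled support boxes."""
    classes = support_labels.unique()
    return torch.stack([support_feats[support_labels == c].mean(dim=0) for c in classes])


def classify(query_feats: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """Assign each query box embedding to the nearest prototype by cosine similarity."""
    sims = F.cosine_similarity(query_feats.unsqueeze(1), prototypes.unsqueeze(0), dim=-1)
    return sims.argmax(dim=1)


# 5-shot example: 3 novel classes, 128-D box embeddings from a frozen base detector.
support = torch.randn(15, 128)
labels = torch.arange(3).repeat_interleave(5)
protos = class_prototypes(support, labels)
pred = classify(torch.randn(4, 128), protos)
```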


Data Availability

The datasets used in this study are available at the following links: KITTI: https://www.cvlibs.net/datasets/kitti/; nuScenes: https://www.nuscenes.org/nuscenes.
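For reference, a minimal loading sketch for both datasets (local paths are assumptions, not part of the article): KITTI LiDAR sweeps are plain float32 binaries readable with NumPy, and nuScenes is accessed through the official nuscenes-devkit.

```python
# Minimal sketch of loading the two datasets listed above (assumed local paths).
import numpy as np
from nuscenes.nuscenes import NuScenes   # pip install nuscenes-devkit

# KITTI: each velodyne .bin file stores float32 points as (x, y, z, reflectance).
points = np.fromfile("kitti/training/velodyne/000000.bin", dtype=np.float32).reshape(-1, 4)
print(points.shape)

# nuScenes: the devkit indexes scenes, samples, and annotations by token.
nusc = NuScenes(version="v1.0-mini", dataroot="data/nuscenes", verbose=True)
sample = nusc.sample[0]
lidar = nusc.get("sample_data", sample["data"]["LIDAR_TOP"])
print(lidar["filename"], len(sample["anns"]))
```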


Author information

Corresponding author

Correspondence to Alok Kumar Tiwari.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Tiwari, A.K., Sharma, G.K. FS-3DSSN: an efficient few-shot learning for single-stage 3D object detection on point clouds. Vis Comput 40, 8125–8139 (2024). https://doi.org/10.1007/s00371-023-03228-8
