FS-3DSSN: an efficient few-shot learning for single-stage 3D object detection on point clouds

Original article
Published in The Visual Computer

Abstract

Current 3D object detection methods achieve promising results on conventional tasks that detect frequently occurring objects such as cars, pedestrians, and cyclists. However, they require many annotated bounding boxes and class labels for training, which are expensive and difficult to obtain. Detecting infrequently occurring objects, such as police vehicles, is nevertheless also essential for successful autonomous driving. We therefore explore the potential of few-shot learning to handle the challenge of detecting infrequent categories. Current 3D object detectors do not have the architecture needed to support this type of learning. This paper therefore presents a new method, termed few-shot single-stage network for 3D object detection (FS-3DSSN), to predict infrequent object categories. FS-3DSSN uses a class-incremental few-shot learning approach to detect infrequent categories without compromising the detection accuracy of frequent categories. It consists of two modules: (i) a single-stage network architecture for 3D object detection (3DSSN) that uses deformable convolutions to detect small objects and (ii) a class-incremental meta-learning module to learn and predict infrequent class categories. 3DSSN obtains 84.53 \(\textrm{mAP}_{\textrm{3D}}\) on the KITTI car category and 73.4 NDS on the nuScenes dataset, outperforming the previous state of the art. FS-3DSSN also shows encouraging results on nuScenes for detecting infrequent categories while maintaining accuracy on frequent classes.
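The abstract names two technical ingredients: deformable convolutions inside the single-stage detector (3DSSN) and a class-incremental meta-learning module for infrequent categories. The implementation details are not reproduced on this page, so the two sketches below are only illustrative. The first is a minimal PyTorch sketch of a deformable convolution block applied to bird's-eye-view (BEV) feature maps; the channel sizes, block placement, and the use of torchvision's DeformConv2d are assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' code): a 3x3 deformable convolution block of the kind
# the abstract attributes to 3DSSN for detecting small objects in BEV feature maps.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableBlock(nn.Module):
    """A deformable conv block whose sampling offsets are predicted by a plain conv."""

    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        # 2 offset values (dx, dy) per kernel position and output location.
        self.offset_conv = nn.Conv2d(
            in_channels, 2 * kernel_size * kernel_size,
            kernel_size=kernel_size, padding=kernel_size // 2)
        self.deform_conv = DeformConv2d(
            in_channels, out_channels,
            kernel_size=kernel_size, padding=kernel_size // 2)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_conv(x)        # (N, 2*K*K, H, W)
        out = self.deform_conv(x, offsets)   # sample input features at offset positions
        return self.relu(self.bn(out))


# Example: a BEV pseudo-image, e.g. from a pillar/voxel encoder (sizes are assumptions).
bev_features = torch.randn(1, 64, 200, 176)   # (batch, channels, H, W)
block = DeformableBlock(64, 128)
print(block(bev_features).shape)              # torch.Size([1, 128, 200, 176])
```

The second is a minimal sketch of a prototype-style metric classifier, one common way to add novel classes from a few labelled examples without retraining the base detector. The paper's actual meta-learning module may differ; all names and shapes below are hypothetical.

```python
# Minimal sketch of prototype-based few-shot classification of box embeddings,
# illustrating class-incremental recognition of infrequent categories.
import torch
import torch.nn.functional as F


def class_prototypes(support_feats: torch.Tensor, support_labels: torch.Tensor) -> torch.Tensor:
    """Mean embedding per novel class, computed from a few labelled support boxes."""
    classes = support_labels.unique()
    return torch.stack([support_feats[support_labels == c].mean(dim=0) for c in classes])


def classify(query_feats: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """Assign each query box embedding to the nearest prototype by cosine similarity."""
    sims = F.cosine_similarity(query_feats.unsqueeze(1), prototypes.unsqueeze(0), dim=-1)
    return sims.argmax(dim=1)


# 5-shot example: 3 novel classes, 128-D box embeddings from a frozen base detector.
support = torch.randn(15, 128)
labels = torch.arange(3).repeat_interleave(5)
protos = class_prototypes(support, labels)
pred = classify(torch.randn(4, 128), protos)
```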


Data Availability

The datasets used in this study are available at the following links: KITTI: https://www.cvlibs.net/datasets/kitti/; nuScenes: https://www.nuscenes.org/nuscenes.
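For reference, a minimal loading sketch for both datasets (local paths are assumptions, not part of the article): KITTI LiDAR sweeps are plain float32 binaries readable with NumPy, and nuScenes is accessed through the official nuscenes-devkit.

```python
# Minimal sketch of loading the two datasets listed above (assumed local paths).
import numpy as np
from nuscenes.nuscenes import NuScenes   # pip install nuscenes-devkit

# KITTI: each velodyne .bin file stores float32 points as (x, y, z, reflectance).
points = np.fromfile("kitti/training/velodyne/000000.bin", dtype=np.float32).reshape(-1, 4)
print(points.shape)

# nuScenes: the devkit indexes scenes, samples, and annotations by token.
nusc = NuScenes(version="v1.0-mini", dataroot="data/nuscenes", verbose=True)
sample = nusc.sample[0]
lidar = nusc.get("sample_data", sample["data"]["LIDAR_TOP"])
print(lidar["filename"], len(sample["anns"]))
```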


Author information

Corresponding author

Correspondence to Alok Kumar Tiwari.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Tiwari, A.K., Sharma, G.K. FS-3DSSN: an efficient few-shot learning for single-stage 3D object detection on point clouds. Vis Comput 40, 8125–8139 (2024). https://doi.org/10.1007/s00371-023-03228-8
