DOI: 10.1145/3664647.3681420
research-article

Sparse Query Dense: Enhancing 3D Object Detection with Pseudo Points

Published: 28 October 2024 Publication History

Abstract

Current LiDAR-only 3D detection methods are limited by the sparsity of point clouds. Previous methods supplement the LiDAR point cloud with pseudo points generated by depth completion, but the pseudo-point sampling process is complex and the resulting pseudo points are unevenly distributed. Moreover, because depth completion is imprecise, the pseudo points suffer from noise and local structural ambiguity, which limits further improvement in detection accuracy. This paper presents SQDNet, a novel framework designed to address these challenges. SQDNet incorporates two key components. The first, SQD (Sparse Query Dense), achieves sparse-to-dense matching via grid position indices, which allows large-scale pseudo points to be sampled rapidly and directly on the dense depth map and streamlines the data preprocessing pipeline; it also uses the density of LiDAR points within these grids to alleviate the uneven distribution and noise of the pseudo points. The second, a sparse 3D backbone, is designed to capture long-distance dependencies, thereby improving voxel feature extraction and mitigating local structural blur in the pseudo points. Experimental results validate the effectiveness of SQD and show considerable detection performance on difficult-to-detect instances in the KITTI test set.
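The grid-index idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sparse_query_dense`, the `cell` size, the assumption that LiDAR points are already projected into pixel coordinates, and the rule of keeping only pixels whose grid cell contains at least one LiDAR point are all hypothetical choices made for illustration.

```python
import numpy as np

def sparse_query_dense(lidar_uvz, dense_depth, cell=4):
    """Hypothetical sketch of sparse-to-dense matching via grid indices.

    lidar_uvz:   (N, 3) LiDAR points assumed to be already projected to
                 image space as (u, v, depth).
    dense_depth: (H, W) depth map produced by some depth-completion model.
    cell:        grid cell size in pixels (an assumed parameter).

    Returns pseudo points (M, 3) as (u, v, depth) and, for each one, the
    number of LiDAR points that fall in the same grid cell (M,).
    """
    H, W = dense_depth.shape
    gh, gw = -(-H // cell), -(-W // cell)  # ceiling division

    # Grid position index of every sparse LiDAR point.
    u = np.clip(lidar_uvz[:, 0].astype(int), 0, W - 1)
    v = np.clip(lidar_uvz[:, 1].astype(int), 0, H - 1)
    cell_idx = (v // cell) * gw + (u // cell)

    # LiDAR point count (density) per grid cell.
    density = np.bincount(cell_idx, minlength=gh * gw).reshape(gh, gw)

    # Look up the cell density for every pixel of the dense depth map.
    vv, uu = np.mgrid[0:H, 0:W]
    pix_density = density[vv // cell, uu // cell]

    # Assumed rule: sample pseudo points only where LiDAR supports the cell.
    mask = (pix_density > 0) & (dense_depth > 0)
    pseudo = np.stack([uu[mask], vv[mask], dense_depth[mask]], axis=1)
    return pseudo, pix_density[mask]
```

In this sketch the per-point density is returned alongside the pseudo points so that a downstream step could down-weight or subsample points from poorly supported cells; how SQDNet actually uses the density is not specified in the abstract.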

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN: 9798400706868
    DOI: 10.1145/3664647

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. local structural blur
    2. point cloud sparsity
    3. sparse 3d backbone
    4. sparse query dense
    5. sparse-to-dense matching

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
