research-article

Sparse Query Dense: Enhancing 3D Object Detection with Pseudo Points

Authors:

Jun Yan

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 409 - 418

https://doi.org/10.1145/3664647.3681420

Published: 28 October 2024

Abstract

Current LiDAR-only 3D detection methods are limited by the sparsity of point clouds. Previous work supplemented the LiDAR point cloud with pseudo points generated by depth completion, but the pseudo-point sampling process was complex and the resulting distribution of pseudo points was uneven. Moreover, because depth completion is imprecise, the pseudo points suffer from noise and local structural ambiguity, which limits further improvement in detection accuracy. This paper presents SQDNet, a novel framework designed to address these challenges. SQDNet incorporates two key components. The first, SQD, achieves sparse-to-dense matching via grid position indices, allowing large-scale pseudo points to be sampled rapidly and directly on the dense depth map, which streamlines the data preprocessing pipeline. SQD also uses the density of LiDAR points within these grids to alleviate the uneven distribution and noise of the pseudo points. The second, a sparse 3D backbone, is designed to capture long-distance dependencies, thereby improving voxel feature extraction and mitigating local structural blur in the pseudo points. Experimental results validate the effectiveness of SQD and show considerable detection performance for difficult-to-detect instances on the KITTI test set.
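The abstract gives no implementation details, so the following is only a rough sketch of what a grid-indexed sparse-to-dense query could look like, not the authors' SQD. It assumes LiDAR points have already been projected to pixel coordinates and that a completed depth map is available. The names `grid_index`, `query_dense`, `back_project`, `cell`, and `per_point` are all hypothetical.

```python
from collections import Counter


def grid_index(u, v, cell):
    """Map pixel (u, v) to the (row, col) index of its grid cell."""
    return (v // cell, u // cell)


def query_dense(depth_map, lidar_pixels, cell=2, per_point=2):
    """Sample pseudo points from a dense depth map, guided by sparse LiDAR.

    depth_map    : H x W nested list of completed depths (0 = invalid)
    lidar_pixels : iterable of (u, v) pixels of projected LiDAR points
    cell         : grid cell size in pixels (assumed parameter)
    per_point    : pseudo points kept per LiDAR point (assumed parameter)
    """
    height, width = len(depth_map), len(depth_map[0])

    # Sparse side: count LiDAR points per grid cell.
    density = Counter(
        grid_index(u, v, cell)
        for u, v in lidar_pixels
        if 0 <= u < width and 0 <= v < height
    )

    pseudo = []
    for (row, col), count in sorted(density.items()):
        # Dense side: every valid depth pixel inside the same cell.
        candidates = [
            (u, v, depth_map[v][u])
            for v in range(row * cell, min((row + 1) * cell, height))
            for u in range(col * cell, min((col + 1) * cell, width))
            if depth_map[v][u] > 0
        ]
        # The sampling budget grows with LiDAR density; cells with no
        # LiDAR support are never visited, which drops unsupported depths.
        budget = min(len(candidates), count * per_point)
        if budget == 0:
            continue
        step = len(candidates) / budget
        pseudo.extend(candidates[int(i * step)] for i in range(budget))
    return pseudo


def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift a pixel with depth to camera coordinates (pinhole model)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
```

In this sketch the lookup is a dictionary keyed by cell index, so matching sparse points to dense pixels needs no nearest-neighbour search. Tying the per-cell budget to the LiDAR count is one simple way to let point density shape the pseudo-point distribution; the paper may weight or filter differently.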

Index Terms

Sparse Query Dense: Enhancing 3D Object Detection with Pseudo Points
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection

Information & Contributors

Information

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs:
Jianfei Cai, Monash University, Australia
Mohan Kankanhalli, NUS, Singapore
Balakrishnan Prabhakaran, UT Dallas, USA
Susanne Boll, University of Oldenburg, Germany

Program Chairs:
Ramanathan Subramanian, University of Canberra & IIT Ropar, Australia
Liang Zheng, Australian National University, Australia
Vivek K. Singh, Rutgers University, USA
Pablo Cesar, Centrum Wiskunde & Informatica, Netherlands
Lexing Xie, Australian National University, Australia
Dong Xu, University of Hong Kong, Hong Kong

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Key Research and Development Program of China

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 0
Total Downloads: 103
Downloads (last 12 months): 103
Downloads (last 6 weeks): 82

Reflects downloads up to 13 Dec 2024
