research-article

Dig into Detailed Structures: Key Context Encoding and Semantic-based Decoding for Point Cloud Completion

Authors:

Yang Yang

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 6686 - 6695

https://doi.org/10.1145/3664647.3680565

Published: 28 October 2024

Abstract

Recovering the complete shape of a 3D object from limited viewpoints plays an important role in 3D vision. Recent point cloud completion methods favor an encoder-decoder architecture that generates the global structure and local geometry from a set of input point proxies. In this paper, we introduce a completion method that aims to uncover structural details in the input point cloud and make full use of them. Specifically, we improve both the encoding and the decoding stage: (1) Key Context Fusion Encoding extracts and aggregates homologous key context by adaptively increasing the sampling bias towards salient structures and special contour points. (2) Semantic-based Decoding introduces a semantic EdgeConv module to prompt the subsequent Transformer decoder, which effectively learns and generates local geometry with semantic correlations from non-nearest neighbors. We evaluate the method on several 3D point cloud and 2.5D depth image datasets. Both qualitative and quantitative evaluations demonstrate that our method outperforms previous state-of-the-art methods.
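The abstract does not specify how the sampling bias is computed. As a rough illustration of what "increasing the sampling bias towards salient points" could look like, the sketch below weights farthest point sampling by a per-point saliency score. The function name `biased_fps`, the `bias` parameter, the scoring rule, and the assumption that non-negative saliency scores are already available are all hypothetical; this is not the paper's Key Context Fusion Encoding.

```python
import math


def biased_fps(points, saliency, m, bias=1.0):
    """Select m point indices by saliency-weighted farthest point sampling.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score(i) = distance_to_nearest_selected(i) * (1 + bias * saliency[i])
    With bias = 0 this reduces to plain farthest point sampling.
    Saliency scores are assumed to be non-negative.
    """
    n = len(points)
    if len(saliency) != n:
        raise ValueError("points and saliency must have the same length")
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= len(points)")

    # Seed with the most salient point (the first one on ties).
    first = max(range(n), key=lambda i: saliency[i])
    selected = [first]
    chosen = {first}
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[first]) for p in points]

    while len(selected) < m:
        best, best_score = -1, -1.0
        for i in range(n):
            if i in chosen:
                continue
            score = min_dist[i] * (1.0 + bias * saliency[i])
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        chosen.add(best)
        # Update nearest-selected distances with the newly added point.
        for i in range(n):
            min_dist[i] = min(min_dist[i], math.dist(points[i], points[best]))
    return selected


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    sal = [1.0, 0.0, 0.9, 0.0]
    print(biased_fps(pts, sal, 2, bias=0.0))  # plain FPS: picks the far end
    print(biased_fps(pts, sal, 2, bias=1.0))  # biased: prefers the salient point
```

In this toy example, plain farthest point sampling picks the most distant point, while the biased variant trades some spatial coverage for a nearer point with a high saliency score.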

Index Terms

Dig into Detailed Structures: Key Context Encoding and Semantic-based Decoding for Point Cloud Completion
1. Computing methodologies
  1. Computer graphics
    1. Shape modeling

Information & Contributors

Information

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs: Jianfei Cai (Monash University, Australia), Mohan Kankanhalli (NUS, Singapore), Balakrishnan Prabhakaran (UT Dallas, USA), Susanne Boll (University of Oldenburg, Germany)

Program Chairs: Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia), Liang Zheng (Australian National University, Australia), Vivek K. Singh (Rutgers University, USA), Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands), Lexing Xie (Australian National University, Australia), Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Key Research and Development Program of China

Conference

MM '24

Sponsor: SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate: 1,150 of 4,385 submissions (26%)

Overall Acceptance Rate: 2,145 of 8,556 submissions (25%)
