DOI: 10.1145/3664647.3680565
research-article

Dig into Detailed Structures: Key Context Encoding and Semantic-based Decoding for Point Cloud Completion

Published: 28 October 2024

Abstract

Recovering the complete shape of a 3D object from limited viewpoints plays an important role in 3D vision. Recent point cloud completion methods favor an encoder-decoder architecture that generates the global structure and local geometry from a set of input point proxies. In this paper, we introduce a completion method that aims to uncover structural details in the input point cloud and make full use of them. Specifically, we improve both the encoding and the decoding stages of this task: (1) Key Context Fusion Encoding extracts and aggregates homologous key context by adaptively increasing the sampling bias towards salient structure points and special contour points. (2) Semantic-based Decoding introduces a semantic EdgeConv module to prompt the subsequent Transformer decoder, which effectively learns and generates local geometry with semantic correlations from non-nearest neighbors. We evaluate the method on several 3D point cloud and 2.5D depth image datasets. Both qualitative and quantitative evaluations demonstrate that our method outperforms previous state-of-the-art methods.
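
The idea of drawing local context from non-nearest neighbors can be illustrated with a small sketch. It is not the implementation from the paper: it only shows the general EdgeConv pattern (edge feature = the center feature concatenated with the neighbor-minus-center difference), with neighbors selected in feature space instead of coordinate space. The helper names `knn_indices` and `edge_features`, and the toy `coords` and `feats` values, are assumptions made for this illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def knn_indices(vectors: List[Vector], query: int, k: int) -> List[int]:
    """Return indices of the k vectors closest to vectors[query], excluding itself."""
    dists = [
        (math.dist(vectors[query], v), j)
        for j, v in enumerate(vectors)
        if j != query
    ]
    dists.sort()
    return [j for _, j in dists[:k]]


def edge_features(feats: List[Vector], neighbors: List[List[int]]) -> List[List[List[float]]]:
    """EdgeConv-style edge features: concat(f_i, f_j - f_i) for each neighbor j of point i."""
    out = []
    for i, nbrs in enumerate(neighbors):
        fi = list(feats[i])
        out.append([fi + [b - a for a, b in zip(fi, feats[j])] for j in nbrs])
    return out


# Toy data (hypothetical): four points with 3D coordinates and 2D "semantic" features.
coords = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 0.0, 0.0), (0.2, 0.0, 0.0)]
feats = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1), (0.0, 0.9)]

# Neighbors chosen by coordinates vs. by features can differ.
spatial_nbrs = [knn_indices(coords, i, 1) for i in range(len(coords))]
semantic_nbrs = [knn_indices(feats, i, 1) for i in range(len(feats))]

# Edge features built on the feature-space graph.
semantic_edges = edge_features(feats, semantic_nbrs)
```

In this toy setup, point 0 is spatially closest to point 1 but semantically closest to the distant point 2, so a feature-space graph connects it to a non-nearest neighbor. In a learned model the per-edge vectors would then pass through a shared MLP and a pooling step, which the sketch leaves out.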



    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 3d-context
    2. generative model
    3. point cloud completion

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Program of China

    Conference

    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
