Abstract
Point cloud completion aims to restore the full shapes of objects from partial views obtained by 3D optical scanners. To make point cloud completion more robust to azimuthal rotations and more adaptive to real-world scenarios, we propose a novel network for simultaneous rotation-invariant and rotation-equivariant completion that requires no data augmentation, whereas existing approaches require separately trained models for different completion types. Our method consists of several main steps. First, Density Compensation Mapping (DCM) and Aggregative Gaussian Gridding (AGG) modules are introduced to transfer partial point clouds to spherical signals while avoiding unbalanced sampling. Second, an encoder based on group correlation is designed to extract rotation-invariant global features and equivariant azimuthal features from the spherical signals. Third, parallel groups of decoders are proposed to realize rotation-invariant completion based on feature fusion. Finally, a feature remapping module and a Pose Voting Alignment (PVA) algorithm are proposed to unify the feature space and realize rotation-equivariant completion. Based on these modules, we find that group correlation can be extended to the domain of shape completion, that equivariant and invariant completion can be unified in one pipeline, and that our inherently rotation-equivariant and rotation-invariant framework achieves competitive performance compared with existing representative methods.
Data availability
The datasets used and analyzed during the current study are public and available at: https://shapenet.org/, https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html and https://apolloscape.auto/.
Notes
Detailed explanations of some major steps in Algorithm 1 are recorded in “Appendix 3”.
Code will be available at https://github.com/HangWu2020/RISC.
The calculation methods of FLOPs are recorded in “Appendix 4”.
References
Song S, Yu F, Zeng A, Chang AX, Savva M, Funkhouser T (2017) Semantic scene completion from a single depth image. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 190–198. https://doi.org/10.1109/CVPR.2017.28
Varley J, DeChant C, Richardson A, Ruales J, Allen P (2017) Shape completion enabled robotic grasping. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 2442–2447. https://doi.org/10.1109/IROS.2017.8206060
Charles RQ, Su H, Kaichun M, Guibas LJ (2017) PointNet: deep learning on point sets for 3D classification and segmentation. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 77–85. https://doi.org/10.1109/CVPR.2017.16
Qi CR, Yi L, Su H, Guibas L (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. Adv Neural Inf Process Syst
Li Y, Bu R, Sun M, Wu W, Di X, Chen B (2018) PointCNN: convolution on X-transformed points. Adv Neural Inf Process Syst
Achlioptas P, Diamanti O, Mitliagkas I, Guibas L (2018) Learning representations and generative models for 3D point clouds. In: 2018 Proceedings of the 35th international conference on machine learning (ICML), pp 40–49
Groueix T, Fisher M, Kim VG, Russell BC, Aubry M (2018) A papier-mache approach to learning 3D surface generation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 216–224. https://doi.org/10.1109/CVPR.2018.00030
Yang Y, Feng C, Shen Y, Tian D (2018) FoldingNet: point cloud auto-encoder via deep grid deformation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 206–215. https://doi.org/10.1109/CVPR.2018.00029
Yuan W, Khot T, Held D, Mertz C, Hebert M (2018) PCN: point completion network. In: 2018 International conference on 3D vision (3DV), pp 728–737. https://doi.org/10.1109/3DV.2018.00088
Tchapmi LP, Kosaraju V, Rezatofighi H, Reid I, Savarese S (2019) TopNet: structural point cloud decoder. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 383–392. https://doi.org/10.1109/CVPR.2019.00047
Liu M, Sheng L, Yang S, Shao J, Hu SM (2020) Morphing and sampling network for dense point cloud completion. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), pp 11596–11603
Huang Z, Yu Y, Xu J, Ni F, Le X (2020) PF-Net: point fractal network for 3D point cloud completion. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7659–7667. https://doi.org/10.1109/CVPR42600.2020.00768
Wang X, Marcelo HAJ, Lee GH (2020) Cascaded refinement network for point cloud completion. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Xie H, Yao H, Zhou S, Mao J, Zhang S, Sun W (2020) GRNet: gridding residual network for dense point cloud completion. In: European conference on computer vision
Pan L, Chen X, Cai Z, Zhang J, Zhao H, Yi S, Liu Z (2021) Variational relational point completion network. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Chang AX, Funkhouser T, Guibas L, Hanrahan P, Huang Q, Li Z, Savarese S, Savva M, Song S, Su H, Xiao J, Yi L, Yu F (2015) ShapeNet: an information-rich 3D model repository. arXiv:1512.03012
Cohen TS, Welling M (2016) Group equivariant convolutional networks. In: Proceedings of the international conference on machine learning (ICML)
Tatarchenko M, Richter SR, Ranftl R, Li Z, Koltun V, Brox T (2019) What do single-view 3D reconstruction networks learn? In: Proceedings of the IEEE conference on computer vision and pattern recognition
Zhang Z, Hua B, Rosen DW, Yeung S (2019) Rotation invariant convolutions for 3D point clouds deep learning. In: International conference on 3D vision (3DV)
Kim S, Park J, Han B (2020) Rotation-invariant local-to-global representation learning for 3D point cloud. Adv Neural Inf Process Syst
Li X, Li R, Chen G, Fu CW, Cohen-Or D, Heng PA (2021) A rotation-invariant framework for deep point cloud analysis. IEEE Trans Vis Comput Graph. https://doi.org/10.1109/TVCG.2021.3092570
Luo S, Li J, Guan J, Su Y, Cheng C, Peng J, Ma J (2022) Equivariant point cloud analysis via learning orientations for message passing. In: 2022 IEEE/CVF conference on computer vision and pattern recognition (CVPR)
You Y, Lou Y, Shi R, Liu Q, Tai Y-W, Ma L, Wang W, Lu C (2022) Prin/sprin: on extracting point-wise rotation invariant features. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2021.3130590
Shen W, Wei Z, Ren Q, Zhang B, Huang S, Fan J, Zhang Q (2024) Rotation-equivariant quaternion neural networks for 3d point cloud processing. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2023.3346383
Guo Y, Sohel F, Bennamoun M, Lu M, Wan J (2013) Rotational projection statistics for 3D local surface description and object recognition. Int J Comput Vis 105:63–86. https://doi.org/10.1007/s11263-013-0627-y
Rusu RB, Bradski G, Thibaux R, Hsu J (2010) Fast 3D recognition and pose using the viewpoint feature histogram. In: 2010 IEEE/RSJ international conference on intelligent robots and systems, pp 2155–2162. https://doi.org/10.1109/IROS.2010.5651280
Su H, Maji S, Kalogerakis E, Learned-Miller E (2015) Multi-view convolutional neural networks for 3D shape recognition. In: 2015 IEEE international conference on computer vision (ICCV), pp 945–953. https://doi.org/10.1109/ICCV.2015.114
Maturana D, Scherer S (2015) VoxNet: a 3D convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 922–928. https://doi.org/10.1109/IROS.2015.7353481
Wang Y, Sun Y, Liu Z, Sarma SE, Bronstein MM, Solomon JM (2019) Dynamic graph CNN for learning on point clouds. ACM Trans Graph. https://doi.org/10.1145/3326362
Xu M, Ding R, Zhao H, Qi X (2021) PAConv: position adaptive convolution with dynamic kernel assembling on point clouds. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Aoki Y, Goforth H, Srivatsan RA, Lucey S (2019) PointNetLK: robust & efficient point cloud registration using PointNet. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7156–7165. https://doi.org/10.1109/CVPR.2019.00733
Yi L, Zhao W, Wang H, Sung M, Guibas LJ (2019) GSPN: generative shape proposal network for 3D instance segmentation in point cloud. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 3942–3951. https://doi.org/10.1109/CVPR.2019.00407
Yang B, Wang J, Clark R, Hu Q, Wang S, Markham A, Trigoni N (2019) Learning object bounding boxes for 3D instance segmentation on point clouds. Adv Neural Inf Process Syst
Hu Q, Yang B, Xie L, Rosa S, Guo Y, Wang Z, Trigoni N, Markham A (2020) RandLA-Net: efficient semantic segmentation of large-scale point clouds. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Qi CR, Litany O, He K, Guibas L (2019) Deep hough voting for 3D object detection in point clouds. In: 2019 IEEE/CVF international conference on computer vision (ICCV), pp 9276–9285. https://doi.org/10.1109/ICCV.2019.00937
Shi S, Wang X, Li H (2019) PointRCNN: 3D object proposal generation and detection from point cloud. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 770–779. https://doi.org/10.1109/CVPR.2019.00086
Hirschmüller H (2005) Accurate and efficient stereo processing by semi-global matching and mutual information. In: IEEE computer society conference on computer vision and pattern recognition
Yao Y, Luo Z, Li S, Fang T, Quan L (2018) MVSNet: depth inference for unstructured multi-view stereo. In: European conference on computer vision. https://doi.org/10.1007/978-3-030-01237-3_47
Chen R, Han S, Xu J, Su H (2019) Point-based multi-view stereo network. In: 2019 IEEE/CVF international conference on computer vision (ICCV), pp 1538–1547. https://doi.org/10.1109/ICCV.2019.00162
Wei Y, Liu S, Wang Z, Lu J (2019) Conditional single-view shape generation for multi-view stereo reconstruction. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 9643–9652. https://doi.org/10.1109/CVPR.2019.00988
Sarmad M, Lee HJ, Kim YM (2019) RL-GAN-Net: a reinforcement learning agent controlled GAN network for real-time point cloud shape completion. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 5891–5900. https://doi.org/10.1109/CVPR.2019.00605
Pan L (2020) ECG: edge-aware point cloud completion with graph convolution. IEEE Robot Autom Lett 5(3):4392–4398. https://doi.org/10.1109/LRA.2020.2994483
Wen X, Xiang P, Han Z, Cao Y-P, Wan P, Zheng W, Liu Y-S (2021) PMP-net: point cloud completion by learning multi-step point moving paths. In: 2021 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7439–7448. https://doi.org/10.1109/CVPR46437.2021.00736
Xiang P, Wen X, Liu Y-S, Cao Y-P, Wan P, Zheng W, Han Z (2021) Snowflakenet: point cloud completion by snowflake point deconvolution with skip-transformer. In: 2021 IEEE/CVF international conference on computer vision (ICCV), pp 5479–5489. https://doi.org/10.1109/ICCV48922.2021.00545
Lyu Z, Kong Z, Xu X, Pan L, Lin D (2022) A conditional point diffusion-refinement paradigm for 3D point cloud completion. In: International conference on learning representations (ICLR)
Gu J, Ma W-C, Manivasagam S, Zeng W, Wang Z, Xiong Y, Su H, Urtasun R (2020) Weakly-supervised 3D shape completion in the wild. In: European conference on computer vision, pp 283–299
Ma C, Chen Y, Guo P, Guo J, Wang C, Guo Y (2023) Symmetric shape-preserving autoencoder for unsupervised real scene point cloud completion. In: 2023 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 13560–13569
Lin F, Yue Y, Hou S, Yu X, Xu Y, Yamada KD, Zhang Z (2023) Hyperbolic chamfer distance for point cloud completion. In: 2023 IEEE/CVF international conference on computer vision (ICCV), pp 14549–14560
Cohen TS, Geiger M, Köhler J, Welling M (2018) Spherical CNNs. In: International conference on learning representations. arxiv:1801.10130
Esteves C, Allen-Blanchette C, Makadia A, Daniilidis K (2018) Learning SO(3) equivariant representations with spherical CNNs. In: European conference on computer vision . arxiv:1711.06721
You Y, Lou Y, Liu Q, Tai YW, Ma L, Lu C, Wang W (2020) Pointwise rotation-invariant network with adaptive sampling and 3D spherical voxel convolution. AAAI Conf Artif Intell 34:12717–12724
Khoury M, Zhou QY, Koltun V (2017) Learning compact geometric features. In: IEEE international conference on computer vision, pp 153–161
Malassiotis S, Strintzis MG (2007) Snapshots: a novel local surface descriptor and matching algorithm for robust 3D surface alignment. IEEE Trans Pattern Anal Mach Intell 29:1285–1290
Sun X, Lian Z, Xiao J (2019) Srinet: learning strictly rotation-invariant representations for point cloud classification and segmentation. In: the 27th ACM international conference on multimedia, pp 980–988
Luo S, Li J, Guan J, Su Y, Cheng C, Peng J, Ma J (2022) Equivariant point cloud analysis via learning orientations for message passing. In: 2022 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 18910–18919
Shen W, Wei Z, Ren Q, Zhang B, Huang S, Fan J, Zhang Q (2024) Rotation-equivariant quaternion neural networks for 3D point cloud processing. IEEE Trans Pattern Anal Mach Intell
Yu H, Hou J, Qin Z, Saleh M, Shugurov I, Wang K, Busam B, Ilic S (2024) Riga: rotation-invariant and globally-aware descriptors for point cloud registration. IEEE Trans Pattern Anal Mach Intell, pp 1–17
Song S, Lichtenberg SP, Xiao J (2015) SUN RGB-D: a RGB-D scene understanding benchmark suite. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 567–576. https://doi.org/10.1109/CVPR.2015.7298655
Geiger A, Lenz P, Stiller C, Urtasun R (2013) Vision meets robotics: the KITTI dataset. Int J Robot Res 32(11):1231–1237. https://doi.org/10.1177/0278364913491297
Wang P, Huang X, Cheng X, Zhou D, Geng Q, Yang R (2019) The apolloscape open dataset for autonomous driving and its application. IEEE Trans Pattern Anal Mach Intell
Rao Y, Lu J, Zhou J (2019) Spherical fractal convolutional neural networks for point cloud recognition. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 452–460. https://doi.org/10.1109/CVPR.2019.00054
Chen H, Liu S, Chen W, Li H, Randall R (2021) Equivariant point network for 3D point cloud analysis. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14514–14523
Chirikjian GS, Kyatkin AB (2001) Engineering applications of noncommutative harmonic analysis. CRC Press
Kostelec PJ, Rockmore DN (2007) Soft: SO(3) Fourier transforms. Department of Mathematics, Dartmouth College, Hanover, NH
Ulyanov D, Vedaldi A, Lempitsky V (2016) Instance normalization: the missing ingredient for fast stylization. arXiv:1607.08022
Zhao H, Jiang L, Jia J, Torr P, Koltun V (2021) Point transformer. In: 2021 IEEE/CVF international conference on computer vision (ICCV), pp 16239–16248. https://doi.org/10.1109/ICCV48922.2021.01595
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. In: International conference on learning representations
Wu T, Pan L, Zhang J, Wang T, Liu Z, Lin D (2021) Density-aware chamfer distance as a comprehensive metric for point cloud completion. In: Conference on neural information processing systems (NeurIPS)
Li R, Li X, Hui K-H, Fu C-W (2021) SP-GAN: sphere-guided 3D shape generation and manipulation. ACM Trans Graph 40(4):1–12
van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605
Tang J, Chen X, Wang J, Zeng G (2022) Point scene understanding via disentangled instance mesh reconstruction. In: European conference on computer vision
Funding
This work is supported by National Natural Science Foundation of China (Grant No. 51975361).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Supplementary
Please refer to the Appendices for more discussion.
Appendix 1: Full expression of DCM and Inverse DCM
DCM changes the value of \(\theta\) in point x:
where \(A = \frac{-\cos \theta _0}{\sin \theta _0^2}\), \(B=\frac{1}{\sin \theta _0}\), \(L_1 = \frac{I_1\pi }{I_3}\) , \(L_2 = \frac{I_2\pi }{I_3}\), and:
In AGG, we need to transfer points back to their original shape using inverse DCM \(\mathcal {M}^{-1}\):
where A, B, \(I_1\), \(I_2\), \(I_3\) are the same as (2) in Sect. 3.1, except for C:
(24) illustrates that DCM and its inverse are monotonic and continuous. In practice, an alternative is to record the correspondence between the indices of points and grid cells.
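The index-recording alternative can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the names `register`, `restore`, and the toy cell indices are all invented here; the idea is simply to memorize which grid cell each point fell into during DCM/AGG so that the backward pass is a lookup rather than an analytic inversion.

```python
# Hypothetical sketch: record point-index -> grid-cell correspondences during the
# forward mapping, then restore per-point values by lookup instead of applying
# the analytic inverse DCM.
cell_of_point = {}

def register(point_idx, cell_idx):
    """Forward pass: remember which grid cell this point was mapped into."""
    cell_of_point[point_idx] = cell_idx

def restore(cell_values):
    """Backward pass: recover per-point values from per-cell values."""
    return {p: cell_values[c] for p, c in cell_of_point.items()}

# Toy example: points 0 and 1 fall into cell 5, point 2 into cell 9.
register(0, 5)
register(1, 5)
register(2, 9)
restored = restore({5: 0.7, 9: 0.2})   # {0: 0.7, 1: 0.7, 2: 0.2}
```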
Appendix 2: Rotation equivariance of group correlation
We hereby illustrate that module \(E_1\) in Encoder is equivariant to azimuthal rotation on partial point cloud \(X_p\). Given the definition that:
where \(\mathcal {F}=\mathrm{{AGG}} \circ \mathrm{{DCM}}\). When \(X_p\) is rotated by \(\mathrm{{Rot}}_z\), without loss of generality, we illustrate the one-dimensional group correlation case (i.e., \(K=1\)) as:
A vital prerequisite of (27) is that \(\mathcal {F}\) should be equivariant to azimuthal rotations on the \(S^2\) sphere. This is intuitive because the rotations of points induced by DCM are about axes in the \(x\)–\(y\) plane, which are all orthogonal to the z-axis. In addition, the derivation from step 2 to step 3 in (27) follows the conclusion of [49], so we do not replicate the full steps here.
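The one-dimensional (\(K=1\)) equivariance property above can be verified numerically. The following sketch uses invented toy signals (the actual spherical signals and group correlation in the paper operate on \(S^2\)); it only checks the underlying discrete fact that circular cross-correlation commutes with cyclic shifts, i.e., correlating a rotated signal equals rotating the correlation.

```python
# Minimal numerical check that 1-D circular cross-correlation is equivariant to
# cyclic shifts (the discrete analogue of azimuthal rotation equivariance).

def circ_corr(f, g):
    """Circular cross-correlation of two equal-length sequences."""
    n = len(f)
    return [sum(f[(i + j) % n] * g[j] for j in range(n)) for i in range(n)]

def cyclic_shift(f, s):
    """Cyclically shift sequence f by s positions (rotation analogue)."""
    return f[-s:] + f[:-s] if s else list(f)

f = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]   # toy azimuthal signal (invented)
g = [1.0, 0.5, 0.0, 0.0, 0.0, 0.5]   # toy filter (invented)

s = 2
lhs = circ_corr(cyclic_shift(f, s), g)   # rotate input, then correlate
rhs = cyclic_shift(circ_corr(f, g), s)   # correlate, then rotate output

assert all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```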
Appendix 3: More explanation of PVC
In this section, we provide more explanations of some steps in Algorithm 1.
Step 3 This step simply executes cyclic shift on \(H_\mathrm{{que}}\) by i bits.
Step 7 Z is a set of L2 distances \(z_i\), and V is a set of vote candidates \(v_i\).
Step 11 C is a changeable set of vote cluster centers. \(\mathrm{{max}}(\mathrm{{dist}}(V, C))\) is the unidirectional Hausdorff distance from V to C, while \(\mathrm{{argmax}}(\mathrm{{dist}}(V, C))\) is the index of the \(v_i\) that is farthest from C. We set the threshold distance r to 6.
Step 18 Function \(\mathrm{{knn}}(V, C)\) clusters all the elements in set V according to the predefined centers in set C and returns several clusters. Function \(\mathrm{{sort}}()\) sorts the clusters from largest to smallest.
Step 20 Function \(\mathrm{{mean}}(G)\) not only computes the mean of the elements in G, but also converts the mean of the votes into an angle.
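The clustering-related steps above can be sketched as follows. This is a simplified, hypothetical illustration, not the paper's Algorithm 1: votes are modeled as plain shift indices, and the function names (`cluster_votes`, `vote_to_angle`) and the bin count are invented; only the greedy center selection under the Hausdorff threshold r, the nearest-center assignment, and the mean-to-angle conversion mirror the steps described above.

```python
# Simplified sketch of the vote-clustering steps: grow cluster centers until the
# unidirectional Hausdorff distance max(dist(V, C)) drops below r, assign votes
# to nearest centers, then convert the largest cluster's mean vote to an angle.
import math

def cluster_votes(votes, r=6.0):
    """Greedily add the vote farthest from current centers until covered by r."""
    centers = [votes[0]]
    while True:
        dists = [min(abs(v - c) for c in centers) for v in votes]
        far_idx = max(range(len(votes)), key=lambda i: dists[i])
        if dists[far_idx] <= r:       # max(dist(V, C)) <= threshold r
            break
        centers.append(votes[far_idx])
    return centers

def vote_to_angle(votes, n_bins, r=6.0):
    centers = cluster_votes(votes, r)
    # knn(V, C): assign each vote to its nearest center, forming clusters
    clusters = {c: [] for c in centers}
    for v in votes:
        nearest = min(centers, key=lambda c: abs(v - c))
        clusters[nearest].append(v)
    # sort() then mean(): take the largest cluster; map its mean vote to an angle
    largest = max(clusters.values(), key=len)
    mean_vote = sum(largest) / len(largest)
    return 2.0 * math.pi * mean_vote / n_bins

# Toy votes (invented): three agree near shift 11, two outliers near shift 40.
angle = vote_to_angle([10, 11, 12, 40, 41], n_bins=64)
```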
Appendix 4: Computation of FLOPs
In Table 4, we compare the FLOPs of different methods. For the neural network in the feature remapping module, we double-check the calculation accuracy using thop and nni, and report the larger value. For the CD calculation of two point clouds \(X\in \mathbb {R}^{n\times 3}\) and \(Y\in \mathbb {R}^{n\times 3}\), one must compute the L2 distance between two 3D points \(n^2\) times, and each computation requires 8 FLOPs (i.e., 3 subtractions, 3 multiplications, and 2 additions).
Note that we ignore some lightweight operations here, including rotations of point clouds, L2 distances between feature vectors, and dynamic vote clustering. All these operations are much lighter than the neural network and the monodirectional CD calculation, so they do not affect the overall comparison between different pose alignment methods.
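The CD FLOPs estimate amounts to a simple product. The following back-of-envelope sketch (the function name and the cloud size 2048 are illustrative, not taken from the paper) encodes the per-pair cost of a squared L2 distance between 3D points:

```python
# FLOPs estimate for one-directional Chamfer distance between two n-point clouds:
# n*n pairwise distance evaluations, each costing 3 subtractions + 3
# multiplications + 2 additions = 8 FLOPs.

def cd_flops(n):
    """Estimated FLOPs for all pairwise 3-D squared-L2 distances."""
    per_pair = 3 + 3 + 2   # subtract, square, and accumulate per coordinate
    return n * n * per_pair

# e.g. two clouds of 2048 points each (illustrative size)
flops = cd_flops(2048)   # 2048**2 * 8 = 33,554,432 FLOPs
```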
Appendix 5: More completion results
Figure 13 illustrates more completion examples, including some asymmetric shapes. We also compare our network performances with existing methods.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, H., Miao, Y. & Fu, R. Shape completion with azimuthal rotations using spherical gridding-based invariant and equivariant network. Neural Comput & Applic 36, 13269–13292 (2024). https://doi.org/10.1007/s00521-024-09712-z