Abstract
We propose a unified 3D flow framework that jointly learns shape embedding and deformation across different shape categories. Our goal is to recover shapes from imperfect point clouds by finding the template in a shape repository that best fits the input after deformation. Accordingly, we learn a shape embedding for template retrieval and a flow-based network for robust deformation. Because the deformation flow can differ considerably between shape categories, we introduce a novel multi-hub module that learns multiple modes of deformation to capture this variation, yielding a network that can handle a wide range of objects from different categories. The shape embedding is designed to retrieve the best-fit template as the nearest neighbor in a latent space. In the embedding, we replace the standard fully connected layer with a tiny structure that significantly reduces network complexity and further improves deformation quality. Experiments show that our method outperforms existing state-of-the-art methods in both qualitative and quantitative comparisons. Finally, our method provides efficient and flexible deformation that can further be used for novel shape design.
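The retrieval step described above, picking the best-fit template as the nearest neighbor in a latent space, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name retrieve_template, the toy 2D latent codes, and the use of Euclidean distance are all assumptions made for illustration, and the learned encoder that would produce the codes is omitted.

```python
import numpy as np


def retrieve_template(query_code, template_codes):
    """Return (index, distance) of the template code nearest to the query.

    Hypothetical helper: the distance metric (Euclidean) is an assumption,
    since the abstract only states that retrieval is a nearest-neighbor
    lookup in a latent space.
    """
    query_code = np.asarray(query_code, dtype=float)
    template_codes = np.asarray(template_codes, dtype=float)
    # Distance from the query embedding to every template embedding.
    dists = np.linalg.norm(template_codes - query_code, axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])


# Toy repository: three templates embedded in a 2D latent space.
# In practice the codes would come from a learned shape encoder.
template_codes = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
query_code = np.array([0.9, 1.2])

best_index, best_distance = retrieve_template(query_code, template_codes)
print(best_index, round(best_distance, 3))
```

In a full pipeline, the retrieved template would then be passed to the deformation network and warped toward the input point cloud. The sketch covers only the lookup.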
Acknowledgements
This work was supported by the National Key R&D Program of China (2020YFB1708900) and the National Natural Science Foundation of China (62072271).
Ethics declarations
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Baiqiang Leng received his B.Eng. degree in computer science from Zhejiang University, China, in 2019 and his master's degree from Tsinghua University, China, in 2022. His research interests include computer vision, deep learning, and 3D shape deformation.
Jingwei Huang is a staff research scientist and project leader at Riemann Lab, 2012 Laboratories, Huawei. He received his Ph.D. degree in computer science in 2020 from Stanford University, USA. His research interests lie at the interface of 3D computer vision, computer graphics, and geometry processing. His passion is harnessing the power of modern deep learning and traditional geometry processing algorithms to automate 3D holistic reconstruction of large-scale real environments.
Guanlin Shen received his B.Eng. degree in computer science from Tsinghua University in 2021. He is currently working towards a D.Eng. degree in the School of Software, Tsinghua University. His research interests include computer vision, deep learning, and 3D object detection.
Bin Wang received his B.Sc. degree in chemistry in 1999, and his Ph.D. degree in computer science from Tsinghua University in 2005. He is currently an associate professor in the School of Software, Tsinghua University. He was a research assistant at the Department of Computer Science, Hong Kong University, and had postdoctoral research training in the ISA/ALICE Research Group, INRIA-LORIA, France. His research interests include computer graphics and computer vision.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
About this article
Cite this article
Leng, B., Huang, J., Shen, G. et al. Shape embedding and retrieval in multi-flow deformation. Comp. Visual Media 10, 439–451 (2024). https://doi.org/10.1007/s41095-022-0315-3
DOI: https://doi.org/10.1007/s41095-022-0315-3