research-article

Self‐supervised non‐rigid structure from motion with improved training of Wasserstein GANs

Published: 06 February 2023

Abstract

This study proposes a self‐supervised method for reconstructing 3D limbic structures from 2D landmarks extracted from a single view. A self‐consistency loss is obtained by applying a random orthogonal projection to the reconstructed 3D structure and reconstructing the projection again. Training is thus self‐supervised through geometric self‐consistency in the reconstruction–projection–reconstruction process. The network, called the SS‐Graphformer, consists mainly of graph convolution and Transformer encoders. Adding a discriminator turns the SS‐Graphformer into the generator of a Wasserstein Generative Adversarial Network with Gradient Penalty. Experiments show that adding this 2D structure discriminator significantly improves reconstruction accuracy.
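The reconstruction–projection–reconstruction cycle can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the exact loss, so a mean squared 3D distance is used; `flat_lift` is a hypothetical stand‐in for the trained network; and the rotation is built from three Euler angles, which is simple but not a uniform sampling of rotations.

```python
import math
import random


def rotation_matrix(ax, ay, az):
    """3x3 rotation R = Rz(az) @ Ry(ay) @ Rx(ax), angles in radians."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def rotate(R, points3d):
    """Apply rotation R to every 3D point."""
    return [tuple(sum(R[i][k] * p[k] for k in range(3)) for i in range(3))
            for p in points3d]


def project(points3d):
    """Orthographic projection: keep x and y, drop the depth z."""
    return [(x, y) for x, y, _ in points3d]


def flat_lift(points2d):
    """Hypothetical placeholder lifter that predicts zero depth everywhere."""
    return [(x, y, 0.0) for x, y in points2d]


def self_consistency_loss(landmarks2d, lift, R):
    """Mean squared 3D gap between the rotated first reconstruction and the
    reconstruction obtained from its 2D projection (assumed loss form)."""
    first = rotate(R, lift(landmarks2d))   # reconstruct, then change the view
    second = lift(project(first))          # project to 2D, reconstruct again
    total = sum((a - b) ** 2 for p, q in zip(first, second) for a, b in zip(p, q))
    return total / len(landmarks2d)


rng = random.Random(0)
landmarks = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
R = rotation_matrix(*(rng.uniform(-math.pi, math.pi) for _ in range(3)))
print("self-consistency loss:", self_consistency_loss(landmarks, flat_lift, R))
```

A lifter that always predicts zero depth is penalised as soon as the view changes, because the rotated structure has depth that the second reconstruction cannot reproduce. Under the identity rotation the loss is zero for any deterministic lifter that preserves the x and y coordinates, which is why the projection direction has to vary during training.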

Graphical Abstract

We present SS‐Graphformer, a graph convolution and Transformer‐based method for 3D structure reconstruction from 2D landmarks. In addition, geometric self‐consistency is used to achieve self‐supervision; when combined with the 2D structure discriminator, the accuracy of the reconstruction can be improved. Extensive experiments show that our model achieves state‐of‐the‐art performance on two popular data sets.
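For reference, the critic objective introduced in "Improved training of Wasserstein GANs", the gradient‐penalty formulation named in the title, can be written as follows. The notation is that of the original WGAN‐GP formulation. Reading the real samples $x$ as observed 2D landmark sets and the generated samples $\tilde{x}$ as 2D projections of reconstructed structures is an assumption about how this article applies it, not something the abstract states.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim \mathbb{P}_r}\big[D(x)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}
      \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term pushes the gradient norm of the critic $D$ toward 1 at points interpolated between real and generated samples, replacing the weight clipping of the original Wasserstein GAN. The penalty weight $\lambda$ is a hyperparameter; its value in this article is not given in the abstract.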



Published In

IET Computer Vision, Volume 17, Issue 4
June 2023
129 pages
EISSN: 1751-9640
DOI: 10.1049/cvi2.v17.4
This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.

Publisher

John Wiley & Sons, Inc.

United States

Author Tags

  1. computer vision
  2. neural nets

Qualifiers

  • Research-article
