research-article

Open access

Generative Multi-View Based 3D Human Pose Estimation

Author:

Motaz Sabri

SIET '21: Proceedings of the 6th International Conference on Sustainable Information Engineering and Technology

Pages 2 - 9

https://doi.org/10.1145/3479645.3479708

Published: 03 November 2021

Abstract

Large amounts of annotated data are essential for modern human pose estimation. We propose a semi-supervised learning scheme that estimates 3D poses from adversarially generated multi-view representations of a single RGB image. The GAN-generated views come from training that aims to produce authentic, less degenerate outputs. Our method targets the shared latent space between 3D and 2D poses and simplifies it by constraining the latent distribution. This noticeably improves the method's generalization and its exploitation of unlabeled depth maps. We use heatmaps to visualize the robustness of attention under a variety of poses. Our method is competitive with state-of-the-art semi-supervised approaches and excels on some challenging poses, as evaluated on the challenging Human3.6M, MPI-INF-3DHP, and Leeds Sports Pose datasets.
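
The abstract does not say how the latent distribution is constrained. As an illustration only, one common way to constrain a latent space is a VAE-style KL penalty that pulls a diagonal Gaussian posterior toward a standard normal prior. The sketch below assumes that choice; the function name `kl_to_standard_normal` and the `mu` / `log_var` parameterization are hypothetical and are not taken from the paper.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a single latent vector.

    Assumption: the encoder outputs a mean and a log-variance per latent
    dimension, as in a standard VAE. The paper's actual constraint may differ.
    """
    if len(mu) != len(log_var):
        raise ValueError("mu and log_var must have the same length")
    # Closed form per dimension: 0.5 * (mu^2 + sigma^2 - log(sigma^2) - 1)
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


if __name__ == "__main__":
    # A posterior that already matches the prior incurs no penalty.
    print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
    # Shifting the mean away from zero is penalized.
    print(kl_to_standard_normal([1.0, -1.0], [0.0, 0.0]))
```

Under this assumption the penalty would be added to the pose reconstruction and adversarial losses with a weighting factor, so that 2D and 3D pose encodings are drawn toward the same simple prior.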

Published In

SIET '21: Proceedings of the 6th International Conference on Sustainable Information Engineering and Technology

September 2021

354 pages

ISBN:9781450384070

DOI:10.1145/3479645

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SIET '21

SIET '21: 6th International Conference on Sustainable Information Engineering and Technology 2021

September 13 - 14, 2021

Malang, Indonesia

Acceptance Rates

Overall Acceptance Rate 45 of 57 submissions, 79%
