research-article

Human pose estimation via multi-layer composite models

Published: 01 May 2015

Abstract

We introduce a hierarchical part-based approach for human pose estimation in static images. Our model is a multi-layer composite of tree-structured pictorial-structure models, each modeling human pose at a different scale and with a different graphical structure. At the highest level, the submodel acts as a person detector, while at the lowest level, the body is decomposed into a collection of many local parts. Edges between adjacent layers of the composite model encode cross-model constraints. This multi-layer composite model relaxes the independence assumptions of tree-structured pictorial-structure models (which can create problems such as double-counting image evidence), while still permitting efficient inference using dual decomposition. We propose an optimization procedure for joint learning of the entire composite model. Our approach outperforms the state of the art on four challenging datasets: Parse, UIUC Sport, Leeds Sport Pose, and FLIC.

Highlights

  • We propose a novel multi-layer graphical model for human pose estimation.
  • We conduct a broad and detailed literature review on hierarchical models, multi-scale models, and mixture models for human pose estimation.
  • We propose a novel dual-decomposition design for efficient inference on the proposed multi-layer composite model.
  • We propose a joint discriminative learning framework using structured SVM.
  • We perform a systematic evaluation of our model against baseline approaches on three challenging state-of-the-art pose estimation datasets.
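The dual-decomposition inference mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sizes (`K`, `PARTS`), the random scores, the brute-force slave solver, and the 1/t step size are all assumptions chosen for brevity. Two tree-structured submodels share the same part variables but use different edges, so their union contains a cycle; each submodel is maximized separately, and Lagrange multipliers are adjusted by subgradient steps until the two solutions agree.

```python
import itertools
import random

K = 4      # candidate locations per part (hypothetical toy size)
PARTS = 3  # number of part variables shared by both submodels


def make_tree(edges, rng):
    """Random unary and pairwise scores for one tree-structured submodel."""
    unary = [[rng.uniform(-1, 1) for _ in range(K)] for _ in range(PARTS)]
    pair = {e: [[rng.uniform(-1, 1) for _ in range(K)] for _ in range(K)]
            for e in edges}
    return unary, pair


def score(model, x):
    """Total score of the joint assignment x under one submodel."""
    unary, pair = model
    s = sum(unary[i][x[i]] for i in range(PARTS))
    return s + sum(t[x[i]][x[j]] for (i, j), t in pair.items())


def solve_slave(model, lam, sign):
    """Maximize submodel score plus sign * multipliers.

    Brute force is used here only because the toy problem is tiny; a
    tree-structured slave would normally be solved by dynamic programming.
    """
    def obj(x):
        return score(model, x) + sign * sum(lam[i][x[i]] for i in range(PARTS))
    best = max(itertools.product(range(K), repeat=PARTS), key=obj)
    return best, obj(best)


def dual_decomposition(model_a, model_b, iters=200):
    """Subgradient dual decomposition over two submodels sharing all parts."""
    lam = [[0.0] * K for _ in range(PARTS)]
    best_dual = float("inf")                      # tightest upper bound so far
    best_primal, best_x = float("-inf"), None     # best feasible solution so far
    for t in range(1, iters + 1):
        xa, va = solve_slave(model_a, lam, +1)
        xb, vb = solve_slave(model_b, lam, -1)
        best_dual = min(best_dual, va + vb)
        # Either slave's solution is a feasible joint assignment: score it fully.
        for x in (xa, xb):
            p = score(model_a, x) + score(model_b, x)
            if p > best_primal:
                best_primal, best_x = p, x
        if xa == xb:          # slaves agree, so the solution is provably optimal
            break
        step = 1.0 / t        # diminishing step size (an assumption)
        for i in range(PARTS):
            lam[i][xa[i]] -= step
            lam[i][xb[i]] += step
    return best_x, best_primal, best_dual


if __name__ == "__main__":
    rng = random.Random(0)
    model_a = make_tree([(0, 1), (1, 2)], rng)   # chain 0-1-2
    model_b = make_tree([(0, 2), (2, 1)], rng)   # chain 0-2-1
    x, primal, dual = dual_decomposition(model_a, model_b)
    print("assignment:", x, "primal:", primal, "dual bound:", dual)
```

The dual value always upper-bounds the best joint score, and the best primal value lower-bounds it, so the gap between the two indicates how close the current solution is to optimal. Because the combined graph has a cycle, agreement between the slaves is not guaranteed in general.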


Cited By

  • (2019) E-discover State-of-the-art Research Trends of Deep Learning for Computer Vision. 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 1360-1365. doi:10.1109/SMC.2019.8914555. Online publication date: 6-Oct-2019


    Information & Contributors

    Information

    Published In

    Signal Processing  Volume 110, Issue C
    May 2015
    263 pages

    Publisher

    Elsevier North-Holland, Inc.

    United States

    Author Tags

    1. Human pose estimation
    2. Object detection

    Qualifiers

    • Research-article
