Abstract
We propose a hierarchical process for inferring the 3D pose of a person from monocular images. First, we infer a learned view-based 2D body model from a single image using non-parametric belief propagation. This approach integrates information from bottom-up body-part proposal processes and deals with self-occlusion to compute distributions over limb poses. Then, we exploit a learned Mixture of Experts model to infer a distribution over 3D poses conditioned on 2D poses. This approach is more general than recent work on inferring 3D pose directly from silhouettes, since the 2D body model provides a richer representation that includes the 2D joint angles and the poses of limbs that may be unobserved in the silhouette. We demonstrate the method in a laboratory setting, where we evaluate the accuracy of the 3D poses against ground truth data. We also estimate 3D body pose in a monocular image sequence. The resulting 3D estimates are sufficiently accurate to serve as proposals for the Bayesian inference of 3D human motion over time.
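As a rough illustration of the second stage, the sketch below shows one common form a Mixture of Experts conditional density can take: p(y | x) = sum_k g_k(x) N(y; W_k x + b_k, sigma_k^2 I), where x stands for a 2D pose feature vector and y for a 3D pose vector. It is not the authors' implementation. The linear-Gaussian experts, the softmax gate, the dimensions, the random parameters, and every name in the code are assumptions made for illustration only.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


class MixtureOfLinearExperts:
    """Hypothetical conditional density p(y | x) with K linear-Gaussian experts."""

    def __init__(self, gate_w, gate_b, W, b, sigma):
        self.gate_w = gate_w  # (K, Dx) gate weights
        self.gate_b = gate_b  # (K,)    gate biases
        self.W = W            # (K, Dy, Dx) expert regression matrices
        self.b = b            # (K, Dy) expert offsets
        self.sigma = sigma    # (K,)    isotropic noise std per expert

    def gates(self, x):
        # Input-dependent mixing weights g_k(x); they sum to 1.
        return softmax(self.gate_w @ x + self.gate_b)

    def expert_means(self, x):
        # Mean prediction of every expert, shape (K, Dy).
        return np.einsum("kij,j->ki", self.W, x) + self.b

    def mean(self, x):
        # Conditional mean E[y | x] = sum_k g_k(x) * mu_k(x).
        return self.gates(x) @ self.expert_means(x)

    def sample(self, x, n, rng):
        # Draw n samples: pick an expert by its gate weight, then add noise.
        g = self.gates(x)
        mu = self.expert_means(x)
        ks = rng.choice(len(g), size=n, p=g)
        noise = rng.standard_normal((n, mu.shape[1])) * self.sigma[ks, None]
        return mu[ks] + noise


# Toy dimensions (assumed): 4-D "2D pose" input, 3-D "3D pose" output, 2 experts.
rng = np.random.default_rng(0)
K, Dx, Dy = 2, 4, 3
moe = MixtureOfLinearExperts(
    gate_w=rng.standard_normal((K, Dx)),
    gate_b=np.zeros(K),
    W=rng.standard_normal((K, Dy, Dx)),
    b=np.zeros((K, Dy)),
    sigma=np.array([0.1, 0.2]),
)
x = rng.standard_normal(Dx)          # stand-in for a 2D pose estimate
samples = moe.sample(x, n=5, rng=rng)  # candidate 3D poses, shape (5, 3)
```

Because the output is a multi-modal distribution rather than a single point estimate, samples such as these could in principle act as proposals for a later inference stage, which is the role the abstract describes for the 3D estimates.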
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sigal, L., Black, M.J. (2006). Predicting 3D People from 2D Pictures. In: Perales, F.J., Fisher, R.B. (eds) Articulated Motion and Deformable Objects. AMDO 2006. Lecture Notes in Computer Science, vol 4069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11789239_19
DOI: https://doi.org/10.1007/11789239_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36031-5
Online ISBN: 978-3-540-36032-2
eBook Packages: Computer Science, Computer Science (R0)