Abstract
We present a model-based, top-down solution to the problem of tracking the 3D position, orientation and full articulation of the human body from markerless visual observations obtained by two synchronized RGBD cameras. Inspired by recent advances in model-based hand tracking (Oikonomidis et al., Efficient Model-based 3D Tracking of Hand Articulations using Kinect, 2011), we treat human body tracking as an optimization problem that is solved using stochastic optimization techniques. We show that the proposed approach is more accurate than state-of-the-art methods that rely on a single RGBD camera. Thus, for applications that require increased accuracy and can afford the extra complexity introduced by the second sensor, the proposed approach constitutes a viable solution to the problem of markerless human motion tracking. Our findings are supported by an extensive quantitative evaluation of the method on a publicly available data set annotated with ground truth.
References
Bissacco, A., Yang, M.-H., Soatto, S.: Fast human pose estimation using appearance and motion via multi-dimensional boosting regression. In: IEEE Computer Vision and Pattern Recognition (2007)
Chen, L., Wei, H., Ferryman, J.: A survey of human motion analysis using depth imagery. Pattern Recognit. Lett. 34(15), 1995–2006 (2013)
Corazza, S., Mündermann, L., Gambaretto, E., Ferrigno, G., Andriacchi, T.: Markerless motion capture through visual hull, articulated ICP and subject specific model generation. Int. J. Comput. Vis. 87(1–2), 156–169 (2010)
Deutscher, J., Reid, I.: Articulated body motion capture by stochastic search. Int. J. Comput. Vis. 61(2), 185–205 (2005)
Gall, J., Rosenhahn, B., Brox, T., Seidel, H.-P.: Optimization and filtering for human motion capture. Int. J. Comput. Vis. 87(1–2), 75–92 (2010)
Gall, J., Stoll, C., de Aguiar, E., Theobalt, C., Rosenhahn, B., Seidel, H.-P.: Motion capture using joint skeleton tracking and surface estimation. In: IEEE Computer Vision and Pattern Recognition, pp. 1746–1753 (2009)
Hamer, H., Schindler, K., Koller-Meier, E., Van Gool, L.: Tracking a hand manipulating an object. In: IEEE International Conference on Computer Vision (2009)
Kennedy, J., Eberhart, R.: Particle swarm optimization. In: IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
Kennedy, J., Eberhart, R., Shi, Y.: Swarm intelligence. Morgan Kaufmann (2001)
Mikic, I., Trivedi, M., Hunter, E., Cosman, P.: Human body model acquisition and tracking using voxel data. Int. J. Comput. Vis. 53(3), 199–223 (2003)
Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. Comput. Vis. Image Underst. 104, 90–126 (2006)
Mussi, L., Ivekovic, S., Cagnoni, S.: Markerless articulated human body tracking from multi-view video with gpu-pso. In: Tempesti, G., Tyrrell, A., Miller, J. (eds.) Evolvable Systems: From Biology to Hardware. Lecture Notes in Computer Science, vol. 6274, pp. 97–108 (2010)
Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., Bajcsy, R.: Berkeley MHAD: a comprehensive multimodal human action database. In: IEEE Workshop on Applications on Computer Vision (WACV) (2013)
Oikonomidis, I., Kyriazis, N., Argyros, A. A.: Markerless and Efficient 26-DOF Hand Pose Recovery. Asian Conf. Comput. Vis. 6494, 744–757 (2010)
Oikonomidis, I., Kyriazis, N., Argyros, A. A.: Efficient Model-based 3D Tracking of Hand Articulations using Kinect. In: British Machine Vision Conference. Dundee, UK (2011)
Oikonomidis, I., Kyriazis, N., Argyros, A. A.: Full DOF tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: IEEE International Conference on Computer Vision (2011)
OpenNI: OpenNI User Guide, pp. 11–32. OpenNI organization, November 2010. http://www.openni.org/documentation (last viewed 19-01-2011)
Pons-Moll, G., Leal-Taixe, L., Truong, T., Rosenhahn, B.: Efficient and robust shape matching for model based human motion capture. In: Mester, R., Felsberg, M. (eds.) Pattern Recognition. Lecture Notes in Computer Science, vol. 6835, pp. 416–425. Springer, Berlin (2011)
Poppe, R.: Vision-based human motion analysis: an overview. Comput. Vis. Image Underst. Vis. Hum. Comput. Interact. 108(1–2), 4–18 (2007)
Shotton, J., Sharp, T., Kipman, A., Fitzgibbon, A., Finocchio, M., Blake, A., Cook, M., Moore, R.: Real-time human pose recognition in parts from single depth images. Commun. ACM 56(1), 116–124 (2013)
Sigal, L., Isard, M., Haussecker, H., Black, M.: Loose-limbed people: estimating 3d human pose and motion using non-parametric belief propagation. Int. J. Comput. Vis. 98(1), 15–48 (2012)
Sminchisescu, C., Kanaujia, A., Li, Z., Metaxas, D.: Discriminative density propagation for 3d human motion estimation. IEEE Comput. Vis. Pattern Recognit. 1, 390–397 (2005)
Smisek, J., Jancosek, M., Pajdla, T.: 3d with kinect. In: IEEE ICCV Workshops, pp. 1154–1160 (2011)
Tzevanidis, K., Zabulis, X., Sarmis, T., Koutlemanis, P., Kyriazis, N., Argyros, A.: From multiple views to textured 3d meshes: a gpu-powered approach. In: ECCV Workshops, pp. 5–11 (2010)
Vicon: Motion capture systems. http://www.vicon.com (2013)
John, V., Trucco, E., Ivekovic, S.: Markerless human articulated tracking using hierarchical particle swarm optimisation. Image Vis. Comput. 28(11), 1530–1547 (2010)
Wilson, J. L.: Microsoft kinect for xbox 360. PC Magazine Communications (2010)
Zhang, L., Sturm, J., Cremers, D., Lee, D.: Real-time human motion tracking using multiple depth cameras. In: Proceedings of the International Conference on Intelligent Robot Systems (IROS) (2012)
Acknowledgments
This work was partially funded by the European Commission under contract FP7-IST-288146 HOBBIT and by the European Union (European Social Fund—ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF)—Research Funding Project: THALIS-UOA-ERASITECHNIS MIS 375435.
Cite this article
Michel, D., Panagiotakis, C. & Argyros, A.A. Tracking the articulated motion of the human body with two RGBD cameras. Machine Vision and Applications 26, 41–54 (2015). https://doi.org/10.1007/s00138-014-0651-0