Abstract
The Lucas–Kanade tracker (LKT) is a commonly used method for tracking target objects across 2D images. The key principle behind LKT object tracking is to warp the object's appearance so as to minimize the difference between the warped appearance and a pre-stored template. Accordingly, the 2D pose of the tracked object, in terms of translation, rotation, and scaling, can be recovered from the warp. To extend the LKT to 3D pose estimation, a model-based 3D LKT assumes a 3D geometric model of the target object and infers the 3D object motion by minimizing the difference between the projected 2D image of the 3D object and the pre-stored 2D image template. In this paper, we propose an extended model-based 3D LKT for estimating 3D head poses by tracking human heads in video sequences. In contrast to the original model-based 3D LKT, which uses a template in which each pixel is represented by a single intensity value, the proposed model-based 3D LKT exploits an adaptive template in which each pixel is modeled by a Gaussian distribution that is continuously updated during head tracking. This probabilistic template modeling improves the tracker's ability to handle temporal pixel fluctuations caused by continuous environmental changes such as varying illumination and dynamic backgrounds. Owing to the probabilistic template, we reformulate head pose estimation as a maximum likelihood estimation problem rather than the original difference-minimization procedure, and derive an algorithm that estimates the best head pose under this formulation. The experimental results show that the proposed extended model-based 3D LKT achieves higher accuracy and reliability than the conventional one. In particular, the proposed LKT handles varying illumination effectively, which the original LKT cannot.
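As a rough sketch of the reformulation described above (the notation here is ours, not necessarily the paper's): let $I$ denote the current frame, $W(\mathbf{x};\mathbf{p})$ the image location of template pixel $\mathbf{x}$ under the warp induced by the 3D pose parameters $\mathbf{p}$, and let each template pixel carry a continuously updated Gaussian with mean $\mu(\mathbf{x})$ and variance $\sigma^{2}(\mathbf{x})$. The classical LKT minimizes a sum of squared differences against a fixed template $T$,

$$\hat{\mathbf{p}}_{\mathrm{LK}} \;=\; \arg\min_{\mathbf{p}} \sum_{\mathbf{x}} \big[\, I\big(W(\mathbf{x};\mathbf{p})\big) - T(\mathbf{x}) \,\big]^{2},$$

whereas the adaptive-template formulation maximizes the likelihood of the warped observations under the per-pixel Gaussians, which (up to an additive constant) is equivalent to a variance-weighted minimization:

$$\hat{\mathbf{p}} \;=\; \arg\max_{\mathbf{p}} \prod_{\mathbf{x}} \mathcal{N}\!\Big( I\big(W(\mathbf{x};\mathbf{p})\big) ;\, \mu(\mathbf{x}),\, \sigma^{2}(\mathbf{x}) \Big) \;=\; \arg\min_{\mathbf{p}} \sum_{\mathbf{x}} \left[ \frac{\big( I(W(\mathbf{x};\mathbf{p})) - \mu(\mathbf{x}) \big)^{2}}{2\,\sigma^{2}(\mathbf{x})} + \log \sigma(\mathbf{x}) \right].$$

Pixels whose intensity fluctuates strongly over time (large $\sigma^{2}(\mathbf{x})$) are thus automatically down-weighted, which is what gives the tracker its robustness to illumination changes and dynamic backgrounds.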
Additional information
This work was supported by grant 95-2221-E-259-011-MY2 from the National Science Council of Taiwan.
Cite this article
Chen, ZW., Chiang, CC. & Hsieh, ZT. Extending 3D Lucas–Kanade tracking with adaptive templates for head pose estimation. Machine Vision and Applications 21, 889–903 (2010). https://doi.org/10.1007/s00138-009-0222-y