Abstract
In this article we present the integration of 3-D shape knowledge into a variational model for level set based image segmentation and contour based 3-D pose tracking. Given the surface model of an object that is visible in the images of one or more cameras calibrated to the same world coordinate system, the object contour extracted by the segmentation method is used to estimate the 3-D pose parameters of the object. Conversely, the surface model projected onto the image plane helps, in a top-down manner, to improve the extraction of the contour. Common alternative segmentation approaches that integrate 2-D shape knowledge face the problem that an object can look very different from different viewpoints, whereas a 3-D free-form model ensures that the model can fit the image data very well in each view. Moreover, the approach additionally solves the problem of determining the object's pose in 3-D space. The performance is demonstrated by numerous experiments with a monocular and a stereo camera system.
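As a rough illustration of what projecting a surface model to the image plane involves, the following Python sketch applies a rigid pose to a few 3-D model points and projects them with an ideal pinhole camera. It is not the formulation used in the article: the function names, the rotation about a single axis, and the camera parameters (`focal_length`, `principal_point`) are assumptions chosen only to keep the example short.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix about the z-axis (illustrative pose parameterisation)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_pose(point, rotation, translation):
    """Rigid motion: rotate a 3-D model point, then translate it."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def project_pinhole(point_cam, focal_length, principal_point):
    """Ideal pinhole projection of a point given in camera coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point is not in front of the camera")
    u = focal_length * x / z + principal_point[0]
    v = focal_length * y / z + principal_point[1]
    return (u, v)

def project_model(points, rotation, translation, focal_length, principal_point):
    """Project every model point under the given pose (hypothetical helper)."""
    return [
        project_pinhole(apply_pose(p, rotation, translation),
                        focal_length, principal_point)
        for p in points
    ]

# Assumed toy data: three model points, placed 5 units in front of the camera.
model_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
image_points = project_model(
    model_points,
    rotation_z(0.0),
    (0.0, 0.0, 5.0),
    focal_length=500.0,
    principal_point=(320.0, 240.0),
)
```

In a joint scheme such as the one described above, the outline of the projected points would serve as a shape prior for the segmentation, and the segmented contour would in turn be used to update the pose. The sketch covers only the projection step.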
Cite this article
Rosenhahn, B., Brox, T. & Weickert, J. Three-Dimensional Shape Knowledge for Joint Image Segmentation and Pose Tracking. Int J Comput Vision 73, 243–262 (2007). https://doi.org/10.1007/s11263-006-9965-3
DOI: https://doi.org/10.1007/s11263-006-9965-3