Abstract
Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step.
An image stream can be represented by the 2F×P measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3.
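The rank claim can be made concrete with a short derivation. This is a sketch under stated assumptions, not the paper's own exposition: the symbols $\mathbf{i}_f$, $\mathbf{j}_f$, $\mathbf{s}_p$, $a_f$, $b_f$ are notation introduced here for illustration, the world origin is assumed to sit at the centroid of the $P$ points, and the matrix is assumed to have been registered by subtracting each row's mean.

```latex
% Orthographic projection of point p in frame f
% (i_f, j_f: orthonormal image-axis directions; a_f, b_f: image-plane translation)
u_{fp} = \mathbf{i}_f^{\top}\mathbf{s}_p + a_f, \qquad
v_{fp} = \mathbf{j}_f^{\top}\mathbf{s}_p + b_f .

% With \sum_p \mathbf{s}_p = \mathbf{0}, the row means equal a_f and b_f, so
\tilde{u}_{fp} = u_{fp} - \tfrac{1}{P}\sum_{q=1}^{P} u_{fq} = \mathbf{i}_f^{\top}\mathbf{s}_p, \qquad
\tilde{v}_{fp} = v_{fp} - \tfrac{1}{P}\sum_{q=1}^{P} v_{fq} = \mathbf{j}_f^{\top}\mathbf{s}_p .

% Stacking all frames and points gives a product of a 2F x 3 and a 3 x P matrix
\tilde{W} =
\begin{bmatrix}
\mathbf{i}_1^{\top} \\ \vdots \\ \mathbf{i}_F^{\top} \\
\mathbf{j}_1^{\top} \\ \vdots \\ \mathbf{j}_F^{\top}
\end{bmatrix}
\begin{bmatrix} \mathbf{s}_1 & \cdots & \mathbf{s}_P \end{bmatrix}
= R\,S, \qquad \operatorname{rank}(\tilde{W}) \le 3 .
```

The bound is attained when the points are not coplanar and the camera rotation is sufficiently varied. With noisy measurements the matrix is only approximately of rank 3.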
Based on this observation, the factorization method uses the singular-value decomposition to factor the measurement matrix into two matrices that represent object shape and camera rotation, respectively. Two of the three translation components are computed in a preprocessing stage. The method can also obtain a full solution from a partially filled-in measurement matrix, which may result from occlusions or tracking failures.
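The factorization step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation. The function names `factorize_rank3` and `synthetic_stream` are hypothetical, the measurement matrix is assumed to be complete (no occlusions) and noise-free, and the preprocessing stage is assumed to be subtraction of each row's mean. The recovered factors are determined only up to an invertible 3×3 transformation; the further step that enforces orthonormal camera axes is omitted here.

```python
import numpy as np

def factorize_rank3(W):
    """Factor a 2F x P measurement matrix into a 2F x 3 motion factor
    and a 3 x P shape factor (hypothetical helper, affine ambiguity remains)."""
    W = np.asarray(W, dtype=float)
    # Row means give the per-frame image centroid: the two translation
    # components that can be read off before factoring (assumption).
    t = W.mean(axis=1, keepdims=True)
    W_reg = W - t
    # Thin SVD; keep only the three largest singular values.
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    root = np.sqrt(s[:3])
    R_hat = U[:, :3] * root            # 2F x 3
    S_hat = root[:, None] * Vt[:3, :]  # 3 x P
    return R_hat, S_hat, t.ravel(), s

def synthetic_stream(F=20, P=30, seed=0):
    """Generate an orthographic image stream of P random 3D points over F frames."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((3, P))
    rows_x, rows_y = [], []
    for _ in range(F):
        # A random orthogonal matrix; its first two rows are the image axes.
        Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
        a, b = rng.standard_normal(2)  # image-plane translation
        rows_x.append(Q[0] @ S + a)
        rows_y.append(Q[1] @ S + b)
    # Horizontal coordinates of all frames first, then vertical coordinates.
    return np.vstack(rows_x + rows_y)

W = synthetic_stream()
R_hat, S_hat, t, s = factorize_rank3(W)
print("fourth / third singular value:", s[3] / s[2])
print("max reconstruction error:", np.abs(R_hat @ S_hat + t[:, None] - W).max())
```

On noise-free synthetic data the fourth singular value is negligible relative to the third, and the product of the two factors plus the row means reproduces the input. With real tracks the ratio of these singular values gives a rough indication of how well the rank-3 model fits.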
The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.
Cite this article
Tomasi, C., Kanade, T. Shape and motion from image streams under orthography: a factorization method. Int J Comput Vision 9, 137–154 (1992). https://doi.org/10.1007/BF00129684
DOI: https://doi.org/10.1007/BF00129684