
Learning Layered Motion Segmentations of Video

International Journal of Computer Vision

Abstract

We present an unsupervised approach for learning a layered representation of a scene from a video for motion segmentation. Our method is applicable to any video containing piecewise parametric motion. The learnt model is a composition of layers, each of which consists of one or more segments. The shape of each segment is represented using a binary matte, and its appearance is given by the RGB value of each point belonging to the matte. The model includes the effects of image projection, lighting, and motion blur. Furthermore, spatial continuity is explicitly modeled, resulting in contiguous segments. Unlike previous approaches, our method does not use reference frame(s) for initialization. The two main contributions of our method are: (i) a novel algorithm for obtaining the initial estimate of the model by dividing the scene into rigidly moving components using efficient loopy belief propagation; and (ii) refinement of the initial estimate using the αβ-swap and α-expansion algorithms, which guarantee a strong local minimum. Results are presented on several classes of objects with different types of camera motion, e.g. videos of a human walking shot with static or translating cameras. We compare our method with the state of the art and demonstrate significant improvements.
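To make the layered representation concrete, the following is a minimal sketch of how segments with binary mattes and RGB appearance could be composited into a frame. It is an illustration only, not the authors' implementation: the names `Segment` and `composite`, the convention that a larger layer number is closer to the camera, and the restriction to pure integer translation are all assumptions made here, and the lighting and motion blur terms mentioned in the abstract are omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[int, int, int]


@dataclass
class Segment:
    """Hypothetical container for one segment of a layered model."""
    matte: List[List[int]]       # binary matte: 1 where the segment is present
    appearance: List[List[RGB]]  # RGB value for each point of the matte
    layer: int                   # assumption: larger number = closer to camera
    offset: Tuple[int, int]      # assumption: per-frame motion is a (row, col) shift


def composite(segments: List[Segment], height: int, width: int,
              background: RGB = (0, 0, 0)) -> List[List[RGB]]:
    """Paint segments back to front so nearer layers occlude farther ones."""
    frame = [[background for _ in range(width)] for _ in range(height)]
    for seg in sorted(segments, key=lambda s: s.layer):
        dr, dc = seg.offset
        for r, row in enumerate(seg.matte):
            for c, on in enumerate(row):
                rr, cc = r + dr, c + dc
                # Only matte points that land inside the frame are drawn.
                if on and 0 <= rr < height and 0 <= cc < width:
                    frame[rr][cc] = seg.appearance[r][c]
    return frame


# A red 2x2 segment in the back layer and a single blue point in front of it.
red, blue = (255, 0, 0), (0, 0, 255)
back = Segment(matte=[[1, 1], [1, 1]],
               appearance=[[red, red], [red, red]],
               layer=0, offset=(0, 0))
front = Segment(matte=[[1, 0], [0, 0]],
                appearance=[[blue, blue], [blue, blue]],
                layer=1, offset=(1, 1))
frame = composite([front, back], height=3, width=3)
```

In this toy frame the blue point lands at row 1, column 1 and hides the red segment there, while the rest of the red segment stays visible. Learning the model amounts to recovering the mattes, appearances, layer order, and per-frame motion from the video alone.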



Author information

Corresponding author

Correspondence to M. Pawan Kumar.

About this article

Cite this article

Pawan Kumar, M., Torr, P.H.S. & Zisserman, A. Learning Layered Motion Segmentations of Video. Int J Comput Vis 76, 301–319 (2008). https://doi.org/10.1007/s11263-007-0064-x

  • DOI: https://doi.org/10.1007/s11263-007-0064-x
