Video diff: highlighting differences between similar actions in videos

Published: 02 November 2015

Abstract

When looking at videos of very similar actions with the naked eye, it is often difficult to notice subtle motion differences between them. In this paper we introduce video diffing, an algorithm that highlights the important differences between a pair of video recordings of similar actions. We overlay the edges of one video onto the frames of the second, and color the edges based on a measure of local dissimilarity between the videos. We measure dissimilarity by extracting spatiotemporal gradients from both videos and calculating how dissimilar histograms of these gradients are at varying spatial scales. We performed a user study with 54 people to compare the ease with which users could use our method to find differences. Users gave our method an average grade of 4.04 out of 5 for ease of use, compared to 3.48 and 2.08 for two baseline approaches. Anecdotal results also show that our overlays are useful in the specific use cases of professional golf instruction and analysis of animal locomotion simulations.
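
As a rough illustration of the dissimilarity measure the abstract describes, the Python sketch below extracts spatiotemporal gradients from two clips, pools them into histograms over grids of varying spatial scale, and compares the histograms cell by cell. It is not the authors' implementation. The six-bin quantization (dominant gradient axis and its sign), the chi-square distance, the grid sizes, and the function names `gradient_histograms`, `chi_square`, and `dissimilarity_maps` are all assumptions made for brevity. Video alignment, edge extraction, and the colored overlay are left out.

```python
import math


def gradient_histograms(video, cells):
    """Per-cell histograms of spatiotemporal gradients for video[t][y][x].

    Each interior voxel votes, weighted by gradient magnitude, into one of
    six bins: the dominant axis (x, y, t) and its sign. This binning is an
    illustrative assumption, not the paper's descriptor.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    hists = [[[0.0] * 6 for _ in range(cells)] for _ in range(cells)]
    for t in range(1, T - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                # Central differences along x, y and t.
                g = (
                    (video[t][y][x + 1] - video[t][y][x - 1]) / 2.0,
                    (video[t][y + 1][x] - video[t][y - 1][x]) / 2.0,
                    (video[t + 1][y][x] - video[t - 1][y][x]) / 2.0,
                )
                mag = math.sqrt(sum(c * c for c in g))
                if mag == 0.0:
                    continue
                axis = max(range(3), key=lambda i: abs(g[i]))
                bin_index = 2 * axis + (1 if g[axis] < 0 else 0)
                # Map the pixel to its grid cell at this spatial scale.
                hists[y * cells // H][x * cells // W][bin_index] += mag
    return hists


def chi_square(h1, h2):
    """Chi-square distance between two histograms after L1 normalization."""
    s1, s2 = sum(h1) or 1.0, sum(h2) or 1.0
    d = 0.0
    for a, b in zip(h1, h2):
        a, b = a / s1, b / s2
        if a + b > 0:
            d += (a - b) ** 2 / (a + b)
    return 0.5 * d


def dissimilarity_maps(video_a, video_b, scales=(1, 2, 4)):
    """Return {cells_per_side: grid of distances} for two aligned clips."""
    maps = {}
    for cells in scales:
        ha = gradient_histograms(video_a, cells)
        hb = gradient_histograms(video_b, cells)
        maps[cells] = [
            [chi_square(ha[r][c], hb[r][c]) for c in range(cells)]
            for r in range(cells)
        ]
    return maps


# Toy clips (6 frames of 8x8 pixels): one brightens left to right,
# the other brightens over time.
spatial_ramp = [[[float(x) for x in range(8)] for _ in range(8)] for _ in range(6)]
temporal_ramp = [[[float(t)] * 8 for _ in range(8)] for t in range(6)]
maps = dissimilarity_maps(spatial_ramp, temporal_ramp, scales=(1, 2))
print(maps[2])  # every cell scores 1.0: the gradient directions never agree
```

In a visualization along the lines the abstract describes, scores like these would drive the color of the overlaid edges; how the scales are combined into one color is left open here.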

Published In

ACM Transactions on Graphics, Volume 34, Issue 6
November 2015
944 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2816795
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. motion
  2. spatiotemporal gradients
  3. video
  4. visualization

Qualifiers

  • Research-article

Funding Sources

  • Qatar Computing Research Institute (QCRI)
  • Quanta Computer

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 100
  • Downloads (Last 6 weeks): 7
Reflects downloads up to 12 Dec 2024

Cited By

  • (2023) Surch: Enabling Structural Search and Comparison for Surgical Videos. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3544548.3580772. Online publication date: 19-Apr-2023
  • (2021) Prominent Structures for Video Analysis and Editing. IEEE Transactions on Visualization and Computer Graphics 27:7, 3305-3317. DOI: 10.1109/TVCG.2020.2970045. Online publication date: 1-Jul-2021
  • (2020) Limb-O. Proceedings of the 11th Augmented Human International Conference, 1-8. DOI: 10.1145/3396339.3396360. Online publication date: 27-May-2020
  • (2019) Towards a better video comparison. Proceedings of Asian CHI Symposium 2019: Emerging HCI Research Collection, 80-89. DOI: 10.1145/3309700.3338443. Online publication date: 4-May-2019
  • (2018) Towards a better video comparison. Proceedings of the 30th Australian Conference on Computer-Human Interaction, 349-353. DOI: 10.1145/3292147.3292183. Online publication date: 4-Dec-2018
  • (2017) AnimDiff. Proceedings of the European Association for Computer Graphics: Short Papers, 29-32. DOI: 10.2312/egsh.20171007. Online publication date: 24-Apr-2017
  • (2017) Enhanced Visualization of Detected 3D Geometric Differences. Computer Graphics Forum 37:1, 159-171. DOI: 10.1111/cgf.13239. Online publication date: 28-Jun-2017
  • (2016) Interoperable Access to Video Content as a Basis for Collaborative Video Editing. 2016 International Conference on Collaboration Technologies and Systems (CTS), 233-240. DOI: 10.1109/CTS.2016.0054. Online publication date: Oct-2016
