research-article

Intra- and Inter-frame Iterative Temporal Convolutional Networks for Video Stabilization

Authors:

Huicong Wu

MMAsia '21: Proceedings of the 3rd ACM International Conference on Multimedia in Asia

Article No.: 28, Pages 1 - 7

https://doi.org/10.1145/3469877.3490608

Published: 10 January 2022

Abstract

Video jitter is an uncomfortable artifact of irregular lens motion over time. A major issue in video stabilization is how to extract motion state information from a run of consecutive video frames. In this paper, we propose a novel sequence model, Intra- and Inter-frame Iterative Temporal Convolutional Networks (I3TC-Net), which alternately transfers the spatial-temporal correlation of motion within and between frames. We hypothesize that the motion state information can be represented by transmission states. Specifically, we employ a combination of Convolutional Long Short-Term Memory (ConvLSTM) and an embedded encoder-decoder to generate a latent stable frame. This frame is used to update the transmission states iteratively and to learn a global homography transformation for each unstable frame, which produces the corresponding stabilized result along the time axis. Furthermore, we create a video dataset to address the lack of stable data and improve training. Experimental results show that our method outperforms state-of-the-art methods on publicly available videos, for example with a 5.4-point improvement in stability score. The project page is available at https://github.com/root2022IIITC/IIITC.
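
The abstract says the network learns one global homography per unstable frame. As a minimal sketch of what applying such a transformation means, and not the authors' implementation, the Python snippet below maps pixel coordinates through a 3x3 homography with NumPy. The function name `apply_homography`, the matrix `H_t`, and the 640x480 frame size are illustrative assumptions; in the paper the matrix would be predicted by the network rather than written by hand.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of (x, y) pixel coordinates through a 3x3 homography H."""
    pts = np.asarray(points, dtype=float)
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1).
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    # Divide by the third coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical per-frame homography (an assumption for illustration):
# a pure translation that cancels a jitter offset of (+3, -2) pixels.
H_t = np.array([[1.0, 0.0, -3.0],
                [0.0, 1.0,  2.0],
                [0.0, 0.0,  1.0]])

# Corners of an assumed 640x480 frame.
corners = np.array([[0, 0], [639, 0], [639, 479], [0, 479]])
stabilized_corners = apply_homography(H_t, corners)
```

A global homography has eight degrees of freedom and moves every pixel of the frame with the same matrix, which is why a single 3x3 estimate per frame is enough to warp the whole image.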

Index Terms

Computing methodologies

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

MMAsia '21: Proceedings of the 3rd ACM International Conference on Multimedia in Asia

December 2021

508 pages

ISBN: 9781450386074

DOI: 10.1145/3469877

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

MMAsia '21

Sponsor: SIGMM

MMAsia '21: ACM Multimedia Asia

December 1 - 3, 2021

Gold Coast, Australia

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%

Bibliometrics

Total Citations: 0
Total Downloads: 92
Downloads (Last 12 months): 12
Downloads (Last 6 weeks): 5

Reflects downloads up to 01 Jan 2025
