research-article

EVRNet: Efficient Video Restoration on Edge Devices

Authors:

Vikram Mulukutla,

Vikas Chandra

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

Pages 983 - 992

https://doi.org/10.1145/3474085.3475477

Published: 17 October 2021

Abstract

In video transmission applications, video signals are transmitted over lossy channels, resulting in low-quality received signals. To restore videos on recipient edge devices in real time, we introduce an efficient video restoration network, EVRNet. EVRNet efficiently allocates parameters inside the network using alignment, differential, and fusion modules. With extensive experiments on different video restoration tasks (deblocking, denoising, and super-resolution), we demonstrate that EVRNet delivers competitive performance to existing methods with significantly fewer parameters and MACs. For example, EVRNet has 260× fewer parameters and 958× fewer MACs than the enhanced deformable convolution-based video restoration network (EDVR) for 4× video super-resolution, while its SSIM score is 0.018 lower than EDVR's. We also evaluated the performance of EVRNet under multiple distortions on an unseen dataset to demonstrate its ability to model variable-length sequences under both camera and object motion.
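
The abstract compares networks by parameter count and MACs (multiply-accumulate operations). As a rough illustration of how these two numbers are usually counted for a single convolution layer, here is a minimal sketch. The helper `conv2d_cost`, the layer sizes, and the depthwise-separable comparison are all hypothetical choices made for this example. They are not taken from the paper and do not describe EVRNet's or EDVR's actual layers.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out, groups=1):
    """Return (parameters, MACs) for one bias-free 2D convolution layer.

    Hypothetical helper for illustration only.
    """
    # Each output filter sees c_in / groups input channels.
    weights_per_filter = (c_in // groups) * kernel * kernel
    params = c_out * weights_per_filter
    # Every weight is used once per output pixel.
    macs = params * h_out * w_out
    return params, macs


# Assumed layer: 64 -> 64 channels, 3x3 kernel, 180x320 output map.
std_params, std_macs = conv2d_cost(64, 64, 3, 180, 320)

# Depthwise-separable alternative: 3x3 depthwise followed by 1x1 pointwise.
dw_params, dw_macs = conv2d_cost(64, 64, 3, 180, 320, groups=64)
pw_params, pw_macs = conv2d_cost(64, 64, 1, 180, 320)
sep_params, sep_macs = dw_params + pw_params, dw_macs + pw_macs

print(f"standard:  {std_params:,} params, {std_macs:,} MACs")
print(f"separable: {sep_params:,} params, {sep_macs:,} MACs")
print(f"reduction: {std_params / sep_params:.1f}x params, "
      f"{std_macs / sep_macs:.1f}x MACs")
```

Summing such per-layer counts over a whole network gives totals of the kind quoted in the abstract. Note that MACs, unlike parameters, depend on the input resolution.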

Supplementary Material

PDF File (mfp1703aux.pdf), 14.45 MB

Index Terms

EVRNet: Efficient Video Restoration on Edge Devices
1. Information systems
  1. Information systems applications
    1. Mobile information processing systems

Information & Contributors

Information

Published In

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

October 2021

5796 pages

ISBN: 9781450386517

DOI: 10.1145/3474085

General Chairs:
Heng Tao Shen, University of Electronic Science and Technology of China, China
Yueting Zhuang, Zhejiang University, China
John R. Smith, IBM, USA

Program Chairs:
Yang Yang, University of Electronic Science and Technology of China, China
Pablo Cesar, CWI & TU Delft, The Netherlands
Florian Metze, FACEBOOK, Inc., USA
Balakrishnan Prabhakaran, University of Texas at Dallas, USA

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MM '21

Sponsor:

SIGMM

MM '21: ACM Multimedia Conference

October 20 - 24, 2021

Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 6
Total Downloads: 332
Downloads (Last 12 months): 49
Downloads (Last 6 weeks): 10

Reflects downloads up to 15 Jan 2025

Citations

Cited By

Jeong S, Kim B, Cha S, Seo K, Chang H, Lee J, Kim Y, Noh J. (2024). Real-Time CNN Training and Compression for Neural-Enhanced Adaptive Live Streaming. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:9 (6023-6039). Online publication date: Sep-2024.
https://doi.org/10.1109/TPAMI.2024.3377372
Lin L, Wang M, Yang J, Zhang K, Zhao T. (2024). Toward Efficient Video Compression Artifact Detection and Removal: A Benchmark Dataset. IEEE Transactions on Multimedia 26 (10816-10827). Online publication date: 2024.
https://doi.org/10.1109/TMM.2024.3414549
Buades A, Lisani J. (2024). Efficient Recurrent Real Video Restoration. Digital Signal Processing (104851). Online publication date: Nov-2024.
https://doi.org/10.1016/j.dsp.2024.104851
Yin D, Dong F, Chen B, Shen D, Zhou R, Guo X, Huang Z. (2023). WAEVSR: Enabling Collaborative Live Video Super-Resolution in Wide-Area MEC Environment. 2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS) (1-11). Online publication date: 19-Jun-2023.
https://doi.org/10.1109/IWQoS57198.2023.10188713
Huang J, Cui J, Ye M, Li S, Zhao Y. (2022). Quality enhancement of compressed screen content video by cross-frame information fusion. Neurocomputing 493:C (486-496). Online publication date: 7-Jul-2022.
https://dl.acm.org/doi/10.1016/j.neucom.2021.12.092
Rota C, Buzzelli M, Bianco S, Schettini R. (2022). Video restoration based on deep learning: a comprehensive survey. Artificial Intelligence Review 56:6 (5317-5364). Online publication date: 27-Oct-2022.
https://dl.acm.org/doi/10.1007/s10462-022-10302-5
