Abstract
Video super-resolution recovers high-resolution frames from multiple low-resolution frames, and recurrent structures are a common framework for the task. BasicVSR employs bidirectional propagation and feature alignment to exploit information from the entire input video efficiently. In this work, we revisit the role of each module in BasicVSR and redesign the network to improve its performance. First, after optical-flow warping, a reference-based feature enrichment module maintains centralized communication with the reference frame, which helps handle complex motion; in addition, each neighboring frame of a selected keyframe is partitioned into two regions according to its motion deviation from the keyframe, and features are extracted with branches of different receptive fields to further alleviate the accumulation of alignment errors. In the feature correction module, we replace the simple stack of residual blocks with a residual-in-residual (RIR) structure and fuse features from different levels, making the final feature representation more comprehensive. Furthermore, dense connections are introduced in the reconstruction module to fully exploit hierarchical feature information for better reconstruction. Experiments on two public datasets, Vid4 and REDS4, show that the proposed model improves PSNR over BasicVSR by 0.27 dB and 0.33 dB, respectively; in terms of visual quality, it produces sharper images with fewer artifacts.
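To make the feature-correction and reconstruction changes described above concrete, the sketch below shows, in PyTorch, a residual-in-residual (RIR) group and a densely connected reconstruction head of the kind the abstract refers to. It is a minimal illustration under stated assumptions: the class names, channel width (64), block counts, and x4 upscaling factor are chosen for the example and do not reproduce the authors' implementation.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Plain residual block: conv-ReLU-conv with a short skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

class ResidualGroup(nn.Module):
    """Residual-in-residual (RIR) group: a stack of residual blocks
    wrapped in a long skip connection (illustrative block count)."""
    def __init__(self, channels, num_blocks):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + self.conv(self.blocks(x))

class DenseReconstruction(nn.Module):
    """Reconstruction head with dense connections: each stage sees the
    concatenation of all earlier feature maps before pixel-shuffle upsampling."""
    def __init__(self, channels, num_layers=3, scale=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels * (i + 1), channels, 3, padding=1),
                nn.LeakyReLU(0.1, inplace=True),
            )
            for i in range(num_layers)
        )
        self.upsample = nn.Sequential(
            nn.Conv2d(channels * (num_layers + 1), 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return self.upsample(torch.cat(feats, dim=1))

if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)            # a propagated/aligned feature map
    feat = ResidualGroup(64, num_blocks=4)(feat)  # feature correction (RIR)
    sr = DenseReconstruction(64)(feat)            # -> (1, 3, 128, 128) for x4 SR
    print(sr.shape)

In a BasicVSR-style recurrent pipeline, a group of this kind would sit after propagation and alignment, and the densely connected head would map the fused features to the upscaled output; the exact placement and hyperparameters here are assumptions for illustration.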
Data availability
No datasets were generated or analysed during the current study.
Funding
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 51508105 and 62271151) and the Fujian Natural Science Foundation (Grant No. 2021J01580).
Contributions
All authors contributed equally to this manuscript.
Ethics declarations
Consent to participate
All authors have agreed to participate.
Competing interests
The authors declare no competing interests.