Dense video super-resolution time-differential network with feature enrichment module

  • Original Paper
Signal, Image and Video Processing

Abstract

Video super-resolution recovers high-resolution frames from multiple low-resolution frames, and recurrent structures are a common framework choice for the task. BasicVSR employs bidirectional propagation and feature alignment to make efficient use of information from the entire input video. In this work, we improve the performance of the network by revisiting the role of each module in BasicVSR and redesigning the network. First, after optical-flow warping, a reference-based feature enrichment module maintains centralized communication with the reference frame, which helps in handling complex motion. At the same time, each selected keyframe is divided into two regions according to the degree of motion deviation of the adjacent frames relative to the keyframe, and models with different receptive fields are applied to these regions for feature extraction, further alleviating the accumulation of alignment errors. In the feature correction module, we replace the simple stack of residual blocks with a residual-in-residual (RIR) structure and fuse features from different levels, which makes the final feature information more comprehensive. In addition, dense connections are introduced in the reconstruction module to promote full use of hierarchical feature information for better reconstruction. Experiments on two public datasets, Vid4 and REDS4, show that the proposed model improves PSNR over BasicVSR by 0.27 dB and 0.33 dB, respectively. In terms of visual quality, the model also improves image clarity and reduces artifacts.
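To give a sense of scale for the reported PSNR gains, the following minimal sketch uses the standard definition PSNR = 10 log10(peak² / MSE) and converts a gain in dB into the implied change in mean squared error. The function names `psnr` and `mse_ratio_for_gain` are illustrative and do not come from the paper, and the conversion is only approximate for dataset-level results, since benchmark PSNR values are typically averaged over frames rather than computed from a single pooled MSE.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE ratio (improved / baseline) implied by a PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)


# Gains over BasicVSR as reported in the abstract.
for dataset, gain in (("Vid4", 0.27), ("REDS4", 0.33)):
    ratio = mse_ratio_for_gain(gain)
    print(f"{dataset}: +{gain} dB PSNR ~ {100 * (1 - ratio):.1f}% lower MSE")
```

Under this simplification, gains of 0.27 dB and 0.33 dB correspond to roughly 6% and 7% lower mean squared error, respectively.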

Data availability

No datasets were generated or analysed during the current study.

Funding

This work was financially supported in part by the National Natural Science Foundation of China (Grant Nos. 51508105 and 62271151) and the Foundation of Fujian Natural Science (Grant No. 2021J01580).

Author information

Contributions

All authors contributed equally to this manuscript.

Corresponding author

Correspondence to Lijun Wu.

Ethics declarations

Consent to participate

All authors have agreed to participate.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, L., Ma, Y. & Chen, Z. Dense video super-resolution time-differential network with feature enrichment module. SIViP 18, 7887–7897 (2024). https://doi.org/10.1007/s11760-024-03436-2

  • DOI: https://doi.org/10.1007/s11760-024-03436-2
