Abstract
Video super-resolution recovers high-resolution frames from multiple low-resolution frames, and recurrent structures are a common framework for the task. BasicVSR employs bidirectional propagation and feature alignment to exploit information from the entire input video efficiently. In this work, we revisit the role of each module in BasicVSR and redesign the network to improve its performance. First, after optical-flow warping, a reference-based feature enrichment module maintains centralized communication with the reference frame, which helps handle complex motion; in addition, each neighboring frame of a selected keyframe is partitioned into two regions according to its motion deviation from the keyframe, and features are extracted with branches of different receptive fields to further alleviate the accumulation of alignment errors. In the feature correction module, we replace the simple stack of residual blocks with a residual-in-residual (RIR) structure and fuse features from different levels, making the final feature representation more comprehensive. Furthermore, dense connections are introduced in the reconstruction module to fully exploit hierarchical feature information for better reconstruction. Experiments on two public datasets, Vid4 and REDS4, show that the proposed model improves PSNR over BasicVSR by 0.27 dB and 0.33 dB, respectively; in terms of visual quality, it produces sharper images with fewer artifacts.
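To make the feature-correction and reconstruction changes described above concrete, the sketch below shows, in PyTorch, a residual-in-residual (RIR) group and a densely connected reconstruction head of the kind the abstract refers to. It is a minimal illustration under stated assumptions: the class names, channel width (64), block counts, and x4 upscaling factor are chosen for the example and do not reproduce the authors' implementation.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Plain residual block: conv-ReLU-conv with a short skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

class ResidualGroup(nn.Module):
    """Residual-in-residual (RIR) group: a stack of residual blocks
    wrapped in a long skip connection (illustrative block count)."""
    def __init__(self, channels, num_blocks):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + self.conv(self.blocks(x))

class DenseReconstruction(nn.Module):
    """Reconstruction head with dense connections: each stage sees the
    concatenation of all earlier feature maps before pixel-shuffle upsampling."""
    def __init__(self, channels, num_layers=3, scale=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels * (i + 1), channels, 3, padding=1),
                nn.LeakyReLU(0.1, inplace=True),
            )
            for i in range(num_layers)
        )
        self.upsample = nn.Sequential(
            nn.Conv2d(channels * (num_layers + 1), 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return self.upsample(torch.cat(feats, dim=1))

if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)            # a propagated/aligned feature map
    feat = ResidualGroup(64, num_blocks=4)(feat)  # feature correction (RIR)
    sr = DenseReconstruction(64)(feat)            # -> (1, 3, 128, 128) for x4 SR
    print(sr.shape)

In a BasicVSR-style recurrent pipeline, a group of this kind would sit after propagation and alignment, and the densely connected head would map the fused features to the upscaled output; the exact placement and hyperparameters here are assumptions for illustration.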
Data availability
No datasets were generated or analysed during the current study.
Funding
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 51508105 and 62271151) and the Fujian Natural Science Foundation (Grant No. 2021J01580).
Contributions
All authors contributed equally to this manuscript.
Ethics declarations
Consent to participate
All authors have agreed to participate.
Competing interests
The authors declare no competing interests.