A Transformer-Based Model for Super-Resolution of Anime Image
Figure 1. An anime image at different resolutions; the resolution increases gradually from left to right. The image is adapted from [3]. We were given authorization to use an illustration copyrighted by Ms. Dai.
Figure 2. The network structure of super-resolution methods.
Figure 3. The blur produced by interpolation-based algorithms. The image on the left is the original anime image; the image on the right is the super-resolution anime image obtained by an interpolation algorithm. Conventional interpolation algorithms do not consider edge characteristics, so blurring appears along edges after processing, degrading image quality. The image is adapted from [3].
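As a minimal illustration of the interpolation baseline behind Figure 3, the sketch below upscales an image with OpenCV's bicubic mode [4]; the file paths and the 4× factor are placeholders for illustration, not the paper's setup.

```python
import cv2

# Load a low-resolution anime image (hypothetical path).
lr = cv2.imread("anime_lr.png")
h, w = lr.shape[:2]

# Bicubic interpolation [4] uses a fixed cubic kernel that ignores edge
# orientation, which is why sharp line art comes out blurred (Figure 3).
sr = cv2.resize(lr, (w * 4, h * 4), interpolation=cv2.INTER_CUBIC)
cv2.imwrite("anime_bicubic_x4.png", sr)
```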
Figure 4. The results of super-resolution methods. (a) Original high-resolution anime image. (b) SR anime image generated by Waifu2x. (c) SR anime image generated by Real-ESRGAN. (d) SR anime image generated by SwinIR (image restoration using a Swin Transformer). For Waifu2x, the green and blue circles mark blur and checkerboard artifacts. For Real-ESRGAN, the circled area shows hallucination artifacts. For SwinIR, the circled area shows a lack of detail, such as line misjudgments. The image is adapted from [3].
Figure 5. Overall network structure.
Figure 6. Shallow feature extraction network structure.
Figure 7. Changes in the spectrogram before and after Gaussian filtering.
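The spectral effect shown in Figure 7 can be reproduced with a short sketch like the one below; the kernel size, sigma, and file path are assumptions for illustration, not the paper's settings.

```python
import cv2
import numpy as np

def magnitude_spectrum(gray):
    # 2-D FFT with the zero frequency shifted to the center,
    # log-scaled for display (the "spectrogram" of Figure 7).
    return np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(gray))))

gray = cv2.imread("anime.png", cv2.IMREAD_GRAYSCALE)  # hypothetical path
blurred = cv2.GaussianBlur(gray, (9, 9), 2.0)         # assumed kernel/sigma

before = magnitude_spectrum(gray)
after = magnitude_spectrum(blurred)
# Gaussian filtering is a low-pass operation: "after" loses energy away from
# the center, while the central low-frequency region is preserved.
```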
Figure 8. Feature maps extracted by different parts of the shallow feature extraction network.
Figure 9. Shallow feature map.
Figure 10. Architecture of a Swin Transformer.
Figure 11. Two successive Swin Transformer layers. W-MSA is the window-based multi-head self-attention mechanism; SW-MSA is the shifted-window multi-head self-attention mechanism.
Figure 12. Swin Transformer block (SWTB).
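For readers unfamiliar with the W-MSA/SW-MSA pairing in Figures 11 and 12, the sketch below shows the core windowing step of the Swin Transformer [27]: regular partitioning for W-MSA, and a half-window cyclic shift before partitioning for SW-MSA. The window size and tensor shapes are toy values, and the attention computation itself is omitted.

```python
import torch

def window_partition(x, ws):
    # x: (B, H, W, C) feature map; split into non-overlapping ws x ws windows,
    # returning (num_windows * B, ws * ws, C) token groups for attention.
    B, H, W, C = x.shape
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)

x = torch.randn(1, 64, 64, 96)  # toy feature map (B, H, W, C)
ws = 8                          # window size (toy value)

# W-MSA: self-attention runs independently inside each regular window.
windows = window_partition(x, ws)                          # (64, 64, 96)

# SW-MSA: cyclically shift by half a window first, so the new windows
# straddle the previous boundaries and information crosses windows.
shifted = torch.roll(x, shifts=(-ws // 2, -ws // 2), dims=(1, 2))
shifted_windows = window_partition(shifted, ws)            # (64, 64, 96)
```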
Figure 13. Self-attention window changes in the deep feature extraction network. At the beginning, the image is divided into several small windows; in the last stage, it is divided into four regions.
Figure 14. Deep feature map.
Figure 15. An upsampling example from the reconstruction stage.
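Assuming the reconstruction stage's upsampling follows the common sub-pixel convolution pattern [18] (the paper's exact configuration may differ), a minimal PyTorch sketch looks like this:

```python
import torch
import torch.nn as nn

scale, c = 2, 64  # toy values
upsampler = nn.Sequential(
    # Expand channels by scale^2, then rearrange them into space:
    # (B, c*scale^2, H, W) -> (B, c, H*scale, W*scale).
    nn.Conv2d(c, c * scale ** 2, kernel_size=3, padding=1),
    nn.PixelShuffle(scale),
)

feat = torch.randn(1, c, 128, 128)  # features entering reconstruction
up = upsampler(feat)                # -> torch.Size([1, 64, 256, 256])
```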
Figure 16. Example images from the anime face dataset (from Seeprettyface and Pixiv). The HR images are 512 × 512; the LR images are 64 × 64, 128 × 128, and 256 × 256. The images are adapted from [37,38].
Figure 17. Example images from the anime character image data (full-body and half-body; from Pixiv). The HR images are 512 × 512; the LR images are 64 × 64, 128 × 128, and 256 × 256. The images are adapted from [3].
Figure 18. Example images from the AnimeCharacter12 dataset. The HR images are 1024 × 1024; the LR images are 128 × 128, 256 × 256, and 512 × 512. The images are adapted from [3].
Figure 19. Example images from the Multi-level anime83 dataset. The HR images are 1024 × 1024; the LR images are 128 × 128, 256 × 256, and 512 × 512. The images are adapted from [3].
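Since the LR images in these datasets are generated from the HR originals (see the limitation noted in Section 4.4.3), HR/LR pairs like those in Figures 16-19 can be produced with a sketch such as the following; the bicubic downsampling mode and file paths are assumptions for illustration.

```python
import cv2

# Build LR counterparts of a 512 x 512 HR image for the 2x/4x/8x tasks.
hr = cv2.imread("hr_512.png")  # hypothetical path
for scale in (2, 4, 8):
    size = 512 // scale        # 256, 128, 64, matching Figures 16 and 17
    lr = cv2.resize(hr, (size, size), interpolation=cv2.INTER_CUBIC)
    cv2.imwrite(f"lr_{size}.png", lr)
```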
Figure 20. An example visual result of an anime face on the test dataset (AnimeFace180). To show the details of the generated anime image, we zoomed in on the face; the red box marks the region that was cropped and enlarged.
Figure 21. An example visual result of an anime character on the test dataset (Multi-level anime83). The red frame marks the hand details.
Figure 22. Result of an example image from the AnimeFace180 test dataset for different super-resolution methods: facial detail comparison on the 2× super-resolution task.
Figure 23. Result of an example image from the AnimeFace180 test dataset for different super-resolution methods: facial detail comparison on the 4× super-resolution task.
Figure 24. Result of an example image from the AnimeFace180 test dataset for different super-resolution methods: facial detail comparison on the 8× super-resolution task.
Figure 25. Result of an example image from the AnimeCharacter12 test dataset for different super-resolution methods on the 2×, 4×, and 8× super-resolution tasks.
Figure 26. Results of three example images from the Multi-level anime83 test dataset for different super-resolution methods on the 4× super-resolution task.
Abstract
1. Introduction
1.1. Background
1.2. Image Super-Resolution Technology
1.2.1. Fundamentals
1.2.2. State-of-the-Art Techniques
- (a) Common network structures
- (b) Recent studies on image super-resolution reconstruction
1.3. The Research Questions, Rationale, and the Context for the Study
1.3.1. Quality of Super-Resolution Images
1.3.2. The Limitations of Anime Datasets
1.4. Contributions and Paper Outline
- We proposed an anime image super-resolution network structure based on the Swin Transformer.
- We modified the conventional Swin Transformer to improve the global awareness capability of the feature extraction network.
- We strengthened the extraction of low-frequency information, given the richness of spatial information in anime images.
- Before the upsampling stage, shallow features are fused with deep features to provide more detailed information for the final result (see the sketch after this list).
- The experimental results were compared numerically and visually with those delivered by conventional convolutional neural network-based and transformer-based methods.
- A series of experiments and an ablation study cover anime image super-resolution tasks at different magnifications (2×, 4×, 8×).
- We constructed our own anime dataset to compensate for the lack of datasets for anime super-resolution tasks.
- Our approach speeds up the creative cycle: creators can work at a low resolution and then recover a high-resolution image.
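A minimal PyTorch sketch of the data flow implied by these contributions (shallow branch, Swin-based deep branch, feature fusion, then upsampling) is shown below. All module internals and channel counts are placeholders; in particular, the deep branch here stands in for the stacked Swin Transformer blocks (SWTBs) of Section 2.2, which are not reproduced.

```python
import torch
import torch.nn as nn

class AnimeSRSketch(nn.Module):
    """Illustrative data flow only; not the paper's exact architecture."""
    def __init__(self, c=64, scale=4):
        super().__init__()
        self.shallow = nn.Conv2d(3, c, 3, padding=1)   # shallow features
        self.deep = nn.Conv2d(c, c, 3, padding=1)      # stand-in for SWTBs
        self.fuse = nn.Conv2d(2 * c, c, 1)             # shallow/deep fusion
        self.reconstruct = nn.Sequential(              # sub-pixel upsampling
            nn.Conv2d(c, c * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(c, 3, 3, padding=1),
        )

    def forward(self, x):
        s = self.shallow(x)
        d = self.deep(s)
        # Fuse shallow and deep features before upsampling, so fine line
        # detail from the shallow branch reaches the reconstruction stage.
        return self.reconstruct(self.fuse(torch.cat([s, d], dim=1)))

sr = AnimeSRSketch()(torch.randn(1, 3, 64, 64))  # -> (1, 3, 256, 256)
```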
2. Proposed Methods
2.1. Shallow Feature Extraction Network
2.2. Deep Feature Extraction Network
2.2.1. General Definition
2.2.2. Swin Transformer Block (SWTB)
2.3. Reconstruction Network
2.4. Loss Function
2.5. Network Parameter Settings
3. Dataset
3.1. Constitution of the New Anime Dataset
3.1.1. Training Datasets
3.1.2. Validation Datasets
3.1.3. Test Datasets
4. Experimental Results
4.1. Environment Settings
4.2. Image Quality Assessment
4.3. Results
4.3.1. Results on Anime Face and Characters
4.3.2. Ablation Studies
4.3.3. Comparison against the State-of-the-Art Methods
- (1) Comparison on AnimeFace180
- (2) Comparison on AnimeCharacter12
- (3) Comparison on Multi-level anime83
- (4) Runtime comparison with different methods
4.4. Discussion
4.4.1. Resolution of Blur and Checkerboard Artifacts
4.4.2. Resolution of Ignored Details
4.4.3. Work Limitations
- (a) The window expansion mechanism is limited by available computing power
- (b) The enhancement of low-frequency information cannot adapt to tasks of different scales
- (c) Use of artificially generated low-resolution images
- (d) Longer runtime
5. Concluding Remarks and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Kelsky, K. Japan Pop! Inside the World of Japanese Popular Culture. Edited by Timothy J. Craig. Armonk, N.Y.: M.E. Sharpe Inc., 2000. ix, 360 pp. $64.95. J. Asian Stud. 2001, 60, 548–549.
2. Napier, S.J. Anime from Akira to Howl’s Moving Castle: Experiencing Contemporary Japanese Animation; St. Martin’s Griffin: New York, NY, USA, 2016.
3. Miss Dai. Available online: https://weibo.com/u/7520558714 (accessed on 18 October 2022).
4. Keys, R. Cubic convolution interpolation for digital image processing. IEEE Trans. Acoust. Speech Signal Process. 1981, 29, 1153–1160.
5. Irani, M.; Peleg, S. Improving resolution by image registration. CVGIP Graph. Model. Image Process. 1991, 53, 231–239.
6. Kim, K.I.; Kwon, Y. Single-image super-resolution using sparse regression and natural image prior. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1127–1133.
7. Xiong, Z.; Sun, X.; Wu, F. Robust web image/video super-resolution. IEEE Trans. Image Process. 2010, 19, 2017–2028.
8. Freeman, W.T.; Jones, T.R.; Pasztor, E.C. Example-based super-resolution. IEEE Comput. Graph. Appl. 2002, 22, 56–65.
9. Chang, H.; Yeung, D.Y.; Xiong, Y. Super-resolution through neighbor embedding. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 27 June–2 July 2004; Volume 1, p. I.
10. Yang, J.; Wright, J.; Huang, T.; Ma, Y. Image super-resolution as sparse representation of raw image patches. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 24–26 June 2008; pp. 1–8.
11. Yang, J.; Wright, J.; Huang, T.S.; Ma, Y. Image super-resolution via sparse representation. IEEE Trans. Image Process. 2010, 19, 2861–2873.
12. Dong, C.; Loy, C.C.; He, K.; Tang, X. Learning a deep convolutional network for image super-resolution. In Computer Vision—ECCV 2014, Proceedings of the 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 184–199.
13. Kim, J.; Lee, J.K.; Lee, K.M. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654.
14. Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144.
15. Kim, J.; Lee, J.K.; Lee, K.M. Deeply-recursive convolutional network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1637–1645.
16. Tai, Y.; Yang, J.; Liu, X. Image super-resolution via deep recursive residual network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
17. Yu, J.; Fan, Y.; Yang, J.; Xu, N.; Wang, Z.; Wang, X.; Huang, T. Wide activation for efficient and accurate image super-resolution. arXiv 2018, arXiv:1808.08718.
18. Shi, W.; Caballero, J.; Huszár, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1874–1883.
19. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
20. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. arXiv 2014, arXiv:1406.2661.
21. Vasu, S.; Thekke Madam, N.; Rajagopalan, A. Analyzing perception-distortion tradeoff using enhanced perceptual super-resolution network. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8–14 September 2018.
22. Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Qiao, Y.; Change Loy, C. ESRGAN: Enhanced super-resolution generative adversarial networks. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8–14 September 2018.
23. Wang, X.; Xie, L.; Dong, C.; Shan, Y. Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 1905–1914.
24. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
25. Yang, F.; Yang, H.; Fu, J.; Lu, H.; Guo, B. Learning texture transformer network for image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5791–5800.
26. Arnab, A.; Dehghani, M.; Heigold, G.; Sun, C.; Lučić, M.; Schmid, C. ViViT: A video vision transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 6836–6846.
27. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022.
28. Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Van Gool, L.; Timofte, R. SwinIR: Image restoration using Swin Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 1833–1844.
29. Yan, C.; Shi, G.; Wu, Z. SMIR: A transformer-based model for MRI super-resolution reconstruction. In Proceedings of the 2021 IEEE International Conference on Medical Imaging Physics and Engineering (ICMIPE), Hefei, China, 12–14 November 2021; pp. 1–6.
30. Wang, Z.; Chen, J.; Hoi, S.C. Deep learning for image super-resolution: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3365–3387.
31. Dutta, V.; Zielinska, T. Networking technologies for robotic applications. arXiv 2015, arXiv:1505.07593.
32. Nagadomi. waifu2x. Available online: http://waifu2x.udp.jp (accessed on 18 October 2022).
33. Matsui, Y.; Ito, K.; Aramaki, Y.; Fujimoto, A.; Ogawa, T.; Yamasaki, T.; Aizawa, K. Sketch-based manga retrieval using Manga109 dataset. Multimed. Tools Appl. 2017, 76, 21811–21838.
34. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi-Morel, M.L. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In Proceedings of the 23rd British Machine Vision Conference (BMVC), Surrey, UK, 3–7 September 2012.
35. Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2472–2481.
36. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012), Lake Tahoe, NV, USA, 3–6 December 2012.
37. Seeprettyface. Available online: seeprettyface.com (accessed on 18 October 2022).
38. Pixiv. Available online: https://www.pixiv.net/ (accessed on 18 October 2022).
39. Zeyde, R.; Elad, M.; Protter, M. On single image scale-up using sparse-representations. In Curves and Surfaces, Proceedings of the 7th International Conference, Avignon, France, 24–30 June 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 711–730.
40. Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), Vancouver, BC, Canada, 7–14 July 2001; Volume 2, pp. 416–423.
41. Huang, J.B.; Singh, A.; Ahuja, N. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5197–5206.
42. Weights & Biases (wandb). Available online: https://wandb.ai (accessed on 18 October 2022).
43. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. In Proceedings of the Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019.
44. Pulli, K.; Baksheev, A.; Kornyakov, K.; Eruhimov, V. Real-time computer vision with OpenCV. Commun. ACM 2012, 55, 61–69.
45. timm. Available online: https://github.com/rwightman/pytorch-image-models (accessed on 18 October 2022).
46. Chen, X.; Wang, X.; Zhou, J.; Dong, C. Activating more pixels in image super-resolution transformer. arXiv 2022, arXiv:2205.04437.
PSNR and SSIM achieved by the proposed method on anime faces and anime characters.

Anime faces:

| Scale | Max PSNR | Average PSNR | Max SSIM | Average SSIM |
|---|---|---|---|---|
| 2× | 48.202 dB | 40.704 dB | 0.996 | 0.987 |
| 4× | 41.251 dB | 33.869 dB | 0.989 | 0.959 |
| 8× | 33.267 dB | 26.554 dB | 0.943 | 0.860 |

Anime characters:

| Scale | Max PSNR | Average PSNR | Max SSIM | Average SSIM |
|---|---|---|---|---|
| 2× | 42.941 dB | 33.598 dB | 0.996 | 0.963 |
| 4× | 42.086 dB | 32.234 dB | 0.992 | 0.934 |
| 8× | 36.024 dB | 27.299 dB | 0.975 | 0.866 |
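For reference, the PSNR (in dB) and SSIM values reported in this and the following tables follow the standard definitions, with L the peak pixel value (255 for 8-bit images) and c1, c2 the usual stabilizing constants:

```latex
\mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}}, \qquad
\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}
                           {(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}
```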
Ablation study of the proposed model (AISR variants; Size is the LR input size in pixels):

| Method | Size | Params | Runtime (images/s) | PSNR | SSIM |
|---|---|---|---|---|---|
| AISR-XS | 128 | 7.21 M | 5.31 | 33.424 dB | 0.951 |
| AISR-XW | 128 | 6.94 M | 5.68 | 33.246 dB | 0.954 |
| AISR-XSW | 128 | 6.07 M | 6.06 | 32.390 dB | 0.947 |
| AISR-O | 128 | 8.37 M | 4.59 | 33.869 dB | 0.959 |
| AISR-XS | 256 | 7.21 M | 0.98 | 31.952 dB | 0.936 |
| AISR-XW | 256 | 6.94 M | 1.06 | 31.864 dB | 0.937 |
| AISR-XSW | 256 | 6.07 M | 1.19 | 31.112 dB | 0.930 |
| AISR-O | 256 | 8.37 M | 0.45 | 32.175 dB | 0.940 |
Comparison on the AnimeFace180 test dataset (average PSNR/SSIM):

| Method | PSNR (2×) | SSIM (2×) | PSNR (4×) | SSIM (4×) | PSNR (8×) | SSIM (8×) |
|---|---|---|---|---|---|---|
| Bicubic [4] | 28.973 dB | 0.933 | 25.860 dB | 0.828 | 20.548 dB | 0.622 |
| Waifu2x [32] | 36.166 dB | 0.978 | 30.088 dB | 0.930 | 24.439 dB | 0.805 |
| Real-ESRGAN [23] | 29.657 dB | 0.983 | 28.963 dB | 0.935 | 22.987 dB | 0.783 |
| SwinIR [28] | 39.553 dB | 0.985 | 32.705 dB | 0.950 | 26.016 dB | 0.845 |
| Ours | 40.704 dB | 0.987 | 33.869 dB | 0.959 | 26.554 dB | 0.860 |
Comparison on the AnimeCharacter12 test dataset (average PSNR/SSIM):

| Method | PSNR (2×) | SSIM (2×) | PSNR (4×) | SSIM (4×) | PSNR (8×) | SSIM (8×) |
|---|---|---|---|---|---|---|
| Bicubic [4] | 33.071 dB | 0.951 | 27.693 dB | 0.876 | 24.669 dB | 0.806 |
| Waifu2x [32] | 38.536 dB | 0.972 | 31.834 dB | 0.934 | 26.974 dB | 0.867 |
| Real-ESRGAN [23] | 32.136 dB | 0.954 | 29.612 dB | 0.925 | 25.733 dB | 0.852 |
| SwinIR [28] | 38.460 dB | 0.971 | 32.467 dB | 0.940 | 27.601 dB | 0.878 |
| Ours | 39.137 dB | 0.973 | 33.081 dB | 0.943 | 27.873 dB | 0.883 |
Comparison on the Multi-level anime83 test dataset (average PSNR/SSIM):

| Method | PSNR (2×) | SSIM (2×) | PSNR (4×) | SSIM (4×) | PSNR (8×) | SSIM (8×) |
|---|---|---|---|---|---|---|
| Bicubic [4] | 31.531 dB | 0.946 | 26.328 dB | 0.850 | 23.331 dB | 0.759 |
| Waifu2x [32] | 37.524 dB | 0.975 | 30.975 dB | 0.932 | 25.878 dB | 0.846 |
| Real-ESRGAN [23] | 31.150 dB | 0.953 | 28.394 dB | 0.914 | 24.502 dB | 0.819 |
| SwinIR [28] | 37.184 dB | 0.973 | 31.439 dB | 0.934 | 26.661 dB | 0.859 |
| Ours | 37.990 dB | 0.975 | 32.175 dB | 0.940 | 27.026 dB | 0.865 |
Runtime comparison of different methods (images processed per second; higher is faster):

| Method | Test Dataset | 2× (images/s) | 4× (images/s) | 8× (images/s) |
|---|---|---|---|---|
| Bicubic | AnimeFace180 | 92.78 | 104.65 | 111.11 |
| Waifu2x | AnimeFace180 | 19.35 | 18.87 | 18.15 |
| Real-ESRGAN | AnimeFace180 | 2.62 | 2.23 | 1.38 |
| SwinIR | AnimeFace180 | 0.95 | 2.24 | 8.57 |
| Ours | AnimeFace180 | 0.391 | 4.59 | 6.71 |
| Bicubic | AnimeCharacter12 | 36.92 | 40.54 | 43.01 |
| Waifu2x | AnimeCharacter12 | 5.73 | 4.731 | 4.32 |
| Real-ESRGAN | AnimeCharacter12 | 0.41 | 1.52 | 2.48 |
| SwinIR | AnimeCharacter12 | 0.34 | 0.99 | 1.94 |
| Ours | AnimeCharacter12 | 0.13 | 0.56 | 1.45 |
| Bicubic | Multi-level anime83 | 31.32 | 36.08 | 38.21 |
| Waifu2x | Multi-level anime83 | 5.69 | 4.89 | 5.42 |
| Real-ESRGAN | Multi-level anime83 | 0.68 | 1.99 | 3.54 |
| SwinIR | Multi-level anime83 | 0.22 | 1.35 | 1.69 |
| Ours | Multi-level anime83 | 0.17 | 0.45 | 1.05 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).