Abstract
Self-attention mechanisms, exemplified by the non-local network, have been widely applied to style transfer. By modeling long-range dependencies between content and style images, such models achieve good stylization while preserving semantic content. However, self-attention must compute the relationship between every position of the content feature map and every position of the style feature map, so its computational complexity is high; this consumes considerable computing resources and degrades the efficiency of style transfer for high-resolution images. To address this problem, we propose a novel Pyramid Style-attentional Network (PSANet), which reduces the computational complexity of the self-attention network by applying pyramid pooling to the feature maps. We compare our method with the vanilla style-attentional network in terms of speed and quality. Experimental results show that our model significantly reduces computational complexity while achieving good transfer effects; in particular, for high-resolution images the execution time is reduced by \(34.7\%\).
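The following is a minimal PyTorch sketch of the idea described above, not the authors' implementation: adaptive pyramid pooling shrinks the style (key/value) feature map before style attention, so the attention cost drops from roughly O(N_c x N_s) to O(N_c x N_p) with N_p much smaller than N_s. All module and parameter names (e.g. pool_sizes) are illustrative assumptions.

```python
# Hedged sketch of pyramid-pooled style attention; layer names and pool sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PyramidStyleAttention(nn.Module):
    def __init__(self, channels, pool_sizes=(1, 3, 6, 8)):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, 1)   # projects content features
        self.key = nn.Conv2d(channels, channels, 1)     # projects style features
        self.value = nn.Conv2d(channels, channels, 1)
        self.out = nn.Conv2d(channels, channels, 1)
        self.pool_sizes = pool_sizes

    def _pyramid(self, x):
        # Concatenate adaptively pooled style features into a short key/value sequence.
        b, c, _, _ = x.shape
        pooled = [F.adaptive_avg_pool2d(x, s).view(b, c, -1) for s in self.pool_sizes]
        return torch.cat(pooled, dim=2)                  # (B, C, N_p), N_p = sum of s*s

    def forward(self, content, style):
        b, c, h, w = content.shape
        q = self.query(content).view(b, c, h * w)        # (B, C, N_c)
        k = self._pyramid(self.key(style))               # (B, C, N_p)
        v = self._pyramid(self.value(style))             # (B, C, N_p)
        attn = torch.softmax(torch.bmm(q.transpose(1, 2), k) / c ** 0.5, dim=-1)
        fused = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return self.out(fused) + content                 # residual connection, as in style-attentional networks
```

Because the pooled sequence length N_p is fixed by pool_sizes rather than by the style image resolution, the attention map no longer grows quadratically with input size, which is the source of the speed-up claimed for high-resolution inputs.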
Acknowledgements
This work was supported by the Natural Science Foundation of Anhui Province of China under Grant No. 2008085MF220.
Ethics declarations
Conflict of interest
We declare that we have no conflict of interest.
Additional information
Gaoming Yang and Shicheng Zhang contributed equally to this work.
About this article
Cite this article
Yang, G., Zhang, S., Fang, X. et al. Pyramid style-attentional network for arbitrary style transfer. Multimed Tools Appl 83, 13483–13502 (2024). https://doi.org/10.1007/s11042-023-15650-0