
CutGAN: dual-Branch generative adversarial network for paper-cut image generation

Published in: Multimedia Tools and Applications

Abstract

Chinese paper-cutting, an ancient folk art, is becoming difficult to preserve and pass down owing to a shortage of skilled paper-cut artists. Unlike other image generation tasks, paper-cut image generation requires not only symmetry and exaggeration but also a degree of resemblance to the subject's facial features. To address these issues, this paper proposes a dual-branch generative adversarial network model for automatically generating paper-cut images, referred to as CutGAN. Specifically, we first construct a paper-cut dataset of 891 pairs of facial images and handcrafted paper-cut images to train and evaluate CutGAN. During the pre-training phase, we train a fixed encoder on gender and eyeglasses recognition tasks. In the fine-tuning phase, we design a flexible encoder based on a modified U-Net structure without skip connections. Furthermore, we introduce an average face loss to increase the diversity and improve the quality of the generated paper-cut images. We conducted extensive qualitative, quantitative, and ablation experiments comparing CutGAN with state-of-the-art baseline models on the test set. The results indicate that CutGAN outperforms other image translation models, generating paper-cut images that more faithfully capture the essence of Chinese paper-cut art while closely resembling the original facial images.
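To make the described architecture concrete, the sketch below outlines the dual-branch generator in PyTorch. It is a minimal illustration assembled from the abstract alone: the layer widths, the channel-concatenation fusion of the two branches, the classification heads used in pre-training, and the exact form of the average face loss are all assumptions, not the authors' implementation.

```python
# Minimal PyTorch sketch of the CutGAN generator as described in the abstract.
# Only the overall design (a fixed encoder pre-trained on gender/eyeglasses
# classification, a flexible U-Net-style encoder without skip connections, and
# an average-face loss term) follows the paper; all specifics are assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    # Downsampling block: a stride-2 convolution halves spatial resolution.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    )


def deconv_block(in_ch, out_ch):
    # Upsampling block: a transposed convolution doubles spatial resolution.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class FixedEncoder(nn.Module):
    """Branch 1: pre-trained on gender and eyeglasses recognition, then frozen."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 64), conv_block(64, 128), conv_block(128, 256)
        )
        # Two heads used only during the pre-training phase (assumed design).
        self.gender_head = nn.Linear(256, 2)
        self.glasses_head = nn.Linear(256, 2)

    def classify(self, x):
        h = self.features(x).mean(dim=(2, 3))  # global average pooling
        return self.gender_head(h), self.glasses_head(h)

    def forward(self, x):
        return self.features(x)


class FlexibleEncoder(nn.Module):
    """Branch 2: U-Net-like encoder *without* the usual skip connections."""

    def __init__(self):
        super().__init__()
        # Note there is no torch.cat with intermediate features anywhere below:
        # the skip connections of the standard U-Net are deliberately removed.
        self.down = nn.Sequential(
            conv_block(3, 64), conv_block(64, 128), conv_block(128, 256)
        )

    def forward(self, x):
        return self.down(x)


class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fixed = FixedEncoder()
        for p in self.fixed.parameters():  # freeze branch 1 after pre-training
            p.requires_grad = False
        self.flexible = FlexibleEncoder()
        self.decoder = nn.Sequential(
            deconv_block(512, 128),
            deconv_block(128, 64),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, x):
        # Fuse the two branches by channel concatenation (an assumption).
        h = torch.cat([self.fixed(x), self.flexible(x)], dim=1)
        return self.decoder(h)


def average_face_loss(fake, avg_face):
    # Illustrative formulation only: push generated images away from the mean
    # paper-cut face to encourage diversity (the paper's exact loss may differ).
    return -torch.mean(torch.abs(fake - avg_face))
```

The structural points follow the abstract: branch one is frozen after pre-training on gender and eyeglasses recognition, and branch two omits the skip connections of a standard U-Net so the decoder cannot simply copy input detail.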



Data availability

The data used in this study are privately held and not publicly available.


Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61373004).

Author information


Corresponding authors

Correspondence to Lijun Yan or Yan Ma.

Ethics declarations

Ethical Approval

The use of facial images in this research has been conducted with the informed consent of the individuals depicted.

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liao, Y., Yan, L., Hou, Z. et al. CutGAN: dual-Branch generative adversarial network for paper-cut image generation. Multimed Tools Appl 83, 55867–55888 (2024). https://doi.org/10.1007/s11042-023-17746-z

