
Few-shot image generation based on contrastive meta-learning generative adversarial network

Published: 21 July 2022

Abstract

Traditional deep generative models rely on enormous amounts of training data to generate images of a given class. However, they face the challenges of expensive, time-consuming data acquisition and the need to learn quickly from limited data of new categories. In this study, a contrastive meta-learning generative adversarial network (CML-GAN) is proposed to generate novel images of unseen classes from a few images by applying a self-supervised contrastive learning strategy within a fast-adaptive meta-learning framework. By introducing a meta-learning framework into a GAN-based model, our model can efficiently learn feature representations and quickly adapt to new generation tasks with only a few samples. The proposed model takes the original input images and the images generated by the GAN-based model as inputs, and evaluates both a contrastive loss and a distance loss on the feature representations of these inputs extracted by the encoder. Each original input image and its generated version from the generator are treated as a positive pair, while the remaining generated images in the same batch are treated as negative samples. The model then learns to differentiate positive samples from negative ones, producing similar representations for each positive pair and distinct representations for negatives, which prevents overfitting. Thus, our model can generalize to generate diverse images from only a few samples of unseen categories while adapting quickly to new image-generation tasks. The effectiveness of our model is demonstrated through extensive experiments on three datasets.
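The pairing scheme described in the abstract — each original image matched with its own generated version as a positive, the other generated images in the batch as negatives — can be sketched as an InfoNCE-style objective. This is a minimal NumPy illustration under standard contrastive-learning assumptions; the paper's exact loss formulation, distance term, and hyperparameters may differ:

```python
import numpy as np

def contrastive_loss(z_real, z_gen, temperature=0.1):
    """InfoNCE-style contrastive loss over encoder features.

    z_real: (B, D) embeddings of the original input images.
    z_gen:  (B, D) embeddings of their generated versions; row i of
            z_gen is the positive for row i of z_real, all other rows
            in the batch act as negatives.
    """
    # L2-normalise so the dot product is cosine similarity
    z_real = z_real / np.linalg.norm(z_real, axis=1, keepdims=True)
    z_gen = z_gen / np.linalg.norm(z_gen, axis=1, keepdims=True)

    logits = z_real @ z_gen.T / temperature      # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability

    # positives lie on the diagonal; cross-entropy against identity targets
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

When the generator reproduces each input faithfully (matched embeddings), the diagonal similarities dominate and the loss is low; mismatched or collapsed generations raise it, which is the pressure that keeps representations of different samples distinct.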


Cited By

  • (2024) A fast-training GAN for coal–gangue image augmentation based on a few samples. The Visual Computer: International Journal of Computer Graphics 40(9), 6671–6687. doi:10.1007/s00371-023-03192-3. Online publication date: 1 Sep 2024
  • (2023) Disentangled representations: towards interpretation of sex determination from hip bone. The Visual Computer: International Journal of Computer Graphics 39(12), 6673–6687. doi:10.1007/s00371-022-02755-0. Online publication date: 1 Dec 2023
  • (2023) MARANet: Multi-scale Adaptive Region Attention Network for Few-Shot Learning. Advances in Computer Graphics, pp. 415–426. doi:10.1007/978-3-031-50069-5_34. Online publication date: 28 Aug 2023



Published In

The Visual Computer: International Journal of Computer Graphics, Volume 39, Issue 9, Sep 2023 (537 pages)

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 21 July 2022
Accepted: 30 May 2022

Author Tags

  1. Few-shot image generation
  2. Contrastive learning
  3. Meta-learning
  4. Generative adversarial network

Qualifiers

  • Research-article

Funding Sources

  • National Key Research and Development Program of China
  • Science and Technology Committee of Shanghai Municipality (STCSM)

