Abstract
Few-shot learning methods enable models to recognize novel classes from one or only a few examples. Most existing methods use episodic training, which requires multiple trained models and/or variable memory. We propose a non-episodic model that combines a \(\beta\)-variational autoencoder (\(\beta\)-VAE) with a cosine similarity classifier (CSC). It is trained only once with a fixed memory requirement and can be used for all test settings. We have also created our own printed-character dataset with style attributes, covering six Indic scripts, to analyze the effect of the \(\beta\)-VAE’s disentangled content on few-shot classification accuracy. In our model, the encoder of the \(\beta\)-VAE generates both the content and style distributions, of which only the content is used by the CSC. The network is trained end-to-end, which enables the encoder to disentangle the content features needed for classification from the style attributes. After the training phase, the weights of novel classes are generated by normalizing the encoder’s content representation and are used for testing. On Omniglot, we obtain 98.91% classification accuracy in the 5-way 1-shot setting, which is on par with or better than state-of-the-art episodic methods. Our ablation study shows that the \(\beta\)-VAE with CSC network outperforms the CSC network alone on the Indic script dataset. Because the encoder feeds disentangled content features to the classifier, our proposed model, trained only once with a fixed memory requirement, gives results comparable to both episodic and non-episodic methods, and in a few settings it also outperforms the episodic methods.
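The weight-imprinting step described above (average the normalized content embeddings of each novel class's support examples, renormalize, and use the result as that class's classifier weight for cosine similarity scoring) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the \(\beta\)-VAE encoder is replaced by synthetic "content" vectors, and the function names, dimensions, and cosine scale factor are all hypothetical choices.

```python
import numpy as np

def l2_normalize(v, eps=1e-8):
    """Normalize rows to unit length (eps avoids division by zero)."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def imprint_weights(content, labels, n_classes):
    """Imprint one classifier weight per class: average the normalized
    content embeddings of that class's support examples, then renormalize."""
    W = np.zeros((n_classes, content.shape[1]))
    for c in range(n_classes):
        W[c] = l2_normalize(content[labels == c]).mean(axis=0)
    return l2_normalize(W)

def cosine_classify(content, W, scale=10.0):
    """Cosine similarity classifier: scaled dot product between
    normalized embeddings and normalized class weights."""
    logits = scale * l2_normalize(content) @ W.T
    return logits.argmax(axis=1)

# Toy 5-way 1-shot episode with synthetic content vectors standing in
# for the encoder's output (one latent direction per class).
rng = np.random.default_rng(0)
protos = rng.normal(size=(5, 16))
support = protos + 0.1 * rng.normal(size=(5, 16))  # 1 shot per class
labels = np.arange(5)
W = imprint_weights(support, labels, n_classes=5)
query = protos + 0.1 * rng.normal(size=(5, 16))
print(cosine_classify(query, W))
```

Because both the embeddings and the imprinted weights are unit-normalized, adding a novel class is just appending one more row to `W`; no retraining of the classifier is needed, which is what makes the single-training-run, fixed-memory setting possible.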
Data Availability
The UHIndicPCwS dataset that we have created will be made available from April 3rd, 2023 at https://scis.uohyd.ac.in/~chakcs/datasets/UHIndicPCwS.html. The other standard dataset that is used in this paper is also publicly available.
Notes
For higher values of \(N\) (\(> 50\)), we ran 100 test episodes due to resource limitations.
Funding
This work is part of research leading to a PhD degree and is currently not funded by any source. There are no conflicts of interest, and ethics approval is not applicable as the work is primarily in the software domain. The authors work at a university with an emphasis on postgraduate research and teaching; therefore, consent for participation and publication is not required and not applicable. The Indic dataset created as part of this work will be made available soon, after internal testing to eliminate any errors. The code was written by the research scholar and does not follow any software engineering standards; its release therefore requires substantial effort and may not happen soon. The first author is a research scholar and the second is her supervisor; the contributions are therefore equally shared.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Pattern Recognition and Machine Learning” guest edited by Ashish Ghosh, Monidipa Das and Anwesha Law.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Padma, V., Bhagvati, C. Non-episodic Variational Weight Imprinting for Few-Shot Learning. SN COMPUT. SCI. 4, 303 (2023). https://doi.org/10.1007/s42979-023-01732-1