
Non-episodic Variational Weight Imprinting for Few-Shot Learning

  • Original Research
SN Computer Science

Abstract

Few-shot learning methods enable models to recognize novel classes from only one or a few examples. Most existing methods use episodic training, which requires multiple training models and/or variable memory. We propose a non-episodic model that combines a \(\beta\)-variational autoencoder (\(\beta\)-VAE) with a cosine similarity classifier (CSC). It is trained only once, with a fixed memory requirement, and can be used for all test settings. We have also created our own printed character dataset with style attributes, covering six Indic scripts, to analyze the effect of the \(\beta\)-VAE's disentangled content on few-shot classification accuracy. In our model, the encoder of the \(\beta\)-VAE generates both the content and style distributions, of which only the content is used by the CSC. The network is trained end-to-end, which enables the encoder to disentangle the content features needed for classification from the style attributes. After the training phase, the weights of novel classes are generated by normalizing the encoder's content representation and are then used for testing. On Omniglot, we achieve 98.91% classification accuracy in the 5-way 1-shot setting, which is on par with or better than state-of-the-art episodic methods. Our ablation study shows that the \(\beta\)-VAE with CSC network outperforms the CSC network alone on the Indic script dataset. Our proposed model, trained only once with a fixed memory requirement, gives results comparable to episodic and non-episodic methods, as the encoder feeds the disentangled content features to the classifier; in a few settings, it also outperforms the episodic methods.
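As a rough illustration of the imprinting and classification steps the abstract describes, the sketch below generates a novel class's weight by normalizing the mean of its support content embeddings and then scores a query by cosine similarity. The dimensions, noise level, and synthetic "content embeddings" are hypothetical placeholders; the paper's actual \(\beta\)-VAE encoder and architecture are not reproduced here.

```python
import numpy as np

# Illustrative only: in the paper, content embeddings come from the trained
# beta-VAE encoder; here we fabricate them as random class directions.
CONTENT_DIM = 16
rng = np.random.default_rng(0)

def imprint_weights(support_embeddings):
    """Weight imprinting: the classifier weight of each novel class is the
    L2-normalized mean of the content embeddings of its support (one- or
    few-shot) examples."""
    weights = []
    for class_embs in support_embeddings:
        mean = np.mean(class_embs, axis=0)
        weights.append(mean / np.linalg.norm(mean))
    return np.stack(weights)  # shape: (n_way, CONTENT_DIM)

def cosine_classify(query_content, weights, scale=10.0):
    """Cosine similarity classifier: each class's score is the scaled cosine
    similarity between the normalized query content vector and that class's
    imprinted weight vector."""
    q = query_content / np.linalg.norm(query_content)
    scores = scale * (weights @ q)
    return int(np.argmax(scores))

# Toy 5-way 1-shot episode: five random class "content" directions, one
# support example per class, and one query drawn near class 3's prototype.
prototypes = rng.normal(size=(5, CONTENT_DIM))
support = [[p + 0.1 * rng.normal(size=CONTENT_DIM)] for p in prototypes]
W = imprint_weights(support)
query = prototypes[3] + 0.1 * rng.normal(size=CONTENT_DIM)
print(cosine_classify(query, W))  # the query is assigned to class 3
```

Because both the query content vector and the imprinted weights are L2-normalized, adding a novel class only appends one unit vector to the weight matrix, which is consistent with the fixed memory requirement and the absence of episodic re-training described above.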


Figures 1–10 (available in the full article)


Data Availability

The UHIndicPCwS dataset that we have created will be made available from April 3rd, 2023 at https://scis.uohyd.ac.in/~chakcs/datasets/UHIndicPCwS.html. The other standard dataset that is used in this paper is also publicly available.

Notes

  1. For higher values of \(N\) (\(> 50\)), we ran 100 test episodes due to resource limitations.


Funding

This work is part of research leading to a PhD degree and is currently not funded by any source. There are no conflicts of interest, and ethics approval is not applicable as the work is primarily in the software domain. The authors work in a university with an emphasis on postgraduate research and teaching; therefore, consent for participation and publication is not required and not applicable. The Indic dataset created as part of this work will be made available soon, after it has been tested internally to eliminate any errors. The code was written by the research scholar and does not follow any software engineering standards; its release therefore requires substantial effort and may not happen soon. The first author is a research scholar and the second is her supervisor; the contributions are therefore equally shared.

Author information

Corresponding author

Correspondence to V. Padma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Pattern Recognition and Machine Learning” guest edited by Ashish Ghosh, Monidipa Das and Anwesha Law.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Padma, V., Bhagvati, C. Non-episodic Variational Weight Imprinting for Few-Shot Learning. SN COMPUT. SCI. 4, 303 (2023). https://doi.org/10.1007/s42979-023-01732-1

