Abstract
Learning to recognize new concepts from few examples is a long-standing challenge in modern computer vision. Metric-based few-shot learning is a prevalent approach to this goal, in which a new query instance is classified into one of the support classes by comparing it with each class prototype. To this end, a representative and discriminative class prototype must be induced from a handful of support example features. In this work, we propose a simple but effective hierarchical pooling induction method that learns such generalized class-level representations by concatenating the max pooling and mean pooling operations. The proposed induction method forms a representative prototype for the given few-shot samples, enhancing both the discrimination of the intermediate features and the final classification performance. Experiments on the benchmark miniImageNet dataset and several practical Remote Sensing Image Scene Classification (RESISC) datasets show that the proposed induction module improves the performance of a state-of-the-art method and outperforms alternative induction methods. Qualitative visualizations and quantitative analyses further demonstrate the effectiveness and robustness of the proposed method.
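As a rough illustration of the hierarchical pooling induction described in the abstract, the sketch below concatenates element-wise max pooling and mean pooling over the support features of one class to form its prototype. This is a minimal NumPy mock-up; the function name and the episode shapes are our assumptions, not the authors' released code.

```python
import numpy as np

def hierarchical_pooling_prototype(support_feats: np.ndarray) -> np.ndarray:
    """Induce a class prototype from few-shot support features.

    support_feats: array of shape (k_shot, feat_dim), one embedding per
    support example of a single class. Returns a prototype of shape
    (2 * feat_dim,): the concatenation of the element-wise max pooling
    and mean pooling over the shots.
    """
    max_pooled = support_feats.max(axis=0)    # (feat_dim,)
    mean_pooled = support_feats.mean(axis=0)  # (feat_dim,)
    return np.concatenate([max_pooled, mean_pooled])  # (2 * feat_dim,)

# Example: a 5-shot episode with 64-dimensional embeddings.
rng = np.random.default_rng(0)
support = rng.normal(size=(5, 64))
prototype = hierarchical_pooling_prototype(support)
print(prototype.shape)  # (128,)
```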
Appendix A: Normal weighted induction
Definition:
For a set of discrete samples \((x_1, x_2, \cdots, x_n)\), we first compute the mean and variance of the samples, denoted as \(\mu\) and \(\sigma^2\), and then define a normal distribution \(f \sim N(\mu, \sigma^2)\). The weight for each \(x_i\) is defined as its probability density \(w_i = f(x_i)\), and the normalized weights are

$$\tilde{w}_i = \frac{w_i}{\sum_{j=1}^{n} w_j}.$$
Finally, we define the normal weighted induction as

$$\hat{x} = \sum_{i=1}^{n} \tilde{w}_i \, x_i.$$
For a multi-dimensional vector or tensor, we perform the above operations element-wise over each dimension, resulting in a vector or tensor of the same dimensionality.
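The following is a minimal sketch of this induction rule, applied element-wise per the definition above. The SciPy-based implementation, the function name, and the small numerical stabilizer are our own assumptions, not the paper's released code.

```python
import numpy as np
from scipy.stats import norm

def normal_weighted_induction(samples: np.ndarray) -> np.ndarray:
    """Normal weighted induction of a prototype, applied element-wise.

    samples: array of shape (n, dim). Each dimension is weighted by the
    Gaussian density fitted to that dimension's values, so samples in the
    tails of the distribution (outliers) receive smaller weights.
    """
    mu = samples.mean(axis=0)                    # per-dimension mean
    sigma = samples.std(axis=0) + 1e-8           # per-dimension std (avoid zero)
    w = norm.pdf(samples, loc=mu, scale=sigma)   # (n, dim) density weights
    w_norm = w / w.sum(axis=0, keepdims=True)    # normalize per dimension
    return (w_norm * samples).sum(axis=0)        # weighted prototype

# 2D toy example echoing Fig. 7: points around (40, 60) plus one outlier.
pts = np.array([[39., 61.], [41., 59.], [40., 60.], [42., 58.], [90., 10.]])
print(pts.mean(axis=0))                # plain mean, pulled away by the outlier
print(normal_weighted_induction(pts))  # closer to the (40, 60) center
```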
To qualitatively demonstrate the normal weighted induction, we visualize its result against that of the unbiased average pooling. Figure 7 shows 2D samples together with their induced prototypes: the red stars indicate the samples, and the two prototypes are indicated by a cross and a circle, respectively.
Although the mean is an unbiased estimate, it deviates far from the true sampling center (40, 60) because of the disturbance of a few singular points. With the normal weighted induction, in contrast, the prototype moves closer to the sampling center, alleviating the influence of the outliers. We can therefore conclude that the normal weighted induction is most helpful when obvious outliers are present.