Abstract
Few-shot segmentation (FSS) aims to segment novel classes given only a small number of labeled samples. Most existing studies do not fine-tune the model during meta-testing, which biases the model towards the base classes and hinders the prediction of novel classes. Other studies fine-tune using only support images, which biases the model towards the support images rather than the target query images, especially when the support and query images differ substantially. To alleviate these issues, we propose an efficient fine-tuning network (EFTNet) that uses unlabeled query images and their predicted pseudo labels to fine-tune the trained model parameters during meta-testing, which biases the model towards the target query images. In addition, we design a query-to-support module, a support-to-query module, and a discrimination module to determine at which fine-tuning round the model achieves its best results. Moreover, the query-to-support module treats the query images and their pseudo masks as additional support images and support masks, so that the prototypes contain query information and tend to yield better predictions. As a new meta-testing scheme, EFTNet can be easily combined with existing methods and greatly improves their performance without repeating the meta-training phase. Extensive experiments on PASCAL-\(5^i\) and COCO-\(20^i\) demonstrate the effectiveness of EFTNet, which also achieves new state-of-the-art performance. Code is available at https://github.com/Jiaguang-NEU/EFTNet.
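The idea of letting prototypes absorb query information can be illustrated with a small sketch. It is not the authors' implementation: it assumes prototypes are computed by masked average pooling and that pseudo masks come from thresholded cosine similarity, both common choices in prototype-based FSS that the abstract does not confirm. All function names, the threshold, and the toy features are hypothetical.

```python
import math


def masked_average_pooling(features, mask):
    """Average the feature vectors of the pixels whose mask value is 1."""
    dim = len(features[0])
    total = [0.0] * dim
    count = 0
    for vec, m in zip(features, mask):
        if m:
            count += 1
            for i, v in enumerate(vec):
                total[i] += v
    if count == 0:
        return total  # no foreground pixels: fall back to a zero prototype
    return [t / count for t in total]


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def predict_mask(features, prototype, threshold=0.5):
    """Label a pixel as foreground when it is similar enough to the prototype.

    The threshold is an assumed value chosen for this toy example.
    """
    return [1 if cosine(vec, prototype) > threshold else 0 for vec in features]


def prototype_with_query(support_feats, support_mask, query_feats, query_pseudo_mask):
    """Pool over support pixels and pseudo-labeled query pixels together."""
    return masked_average_pooling(
        support_feats + query_feats, support_mask + query_pseudo_mask
    )


# Toy per-pixel features (one 2-D vector per pixel, flattened image).
support_feats = [[1.0, 0.0], [0.0, 1.0]]
support_mask = [1, 0]
query_feats = [[0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]

# Step 1: prototype from the labeled support image only.
proto = masked_average_pooling(support_feats, support_mask)
# Step 2: pseudo mask for the unlabeled query image.
pseudo = predict_mask(query_feats, proto)
# Step 3: refined prototype that also contains query information.
refined = prototype_with_query(support_feats, support_mask, query_feats, pseudo)
```

In this toy case the refined prototype moves from the single support foreground pixel towards the query pixels that were pseudo-labeled as foreground, which is the bias towards the query image that the abstract describes.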
Data Availability and Access
Two datasets are used: PASCAL-\(5^i\) (access: http://host.robots.ox.ac.uk/pascal/VOC/) and COCO-\(20^i\) (access: http://cocodataset.org/).
Acknowledgements
This work is supported by the National Natural Science Foundation of China (grant Nos. 61871106 and 61370152), the Key R&D Projects of Liaoning Province, China (grant No. 2020JH2/10100029), and the Open Project Program Foundation of the Key Laboratory of Opto-Electronics Information Processing, Chinese Academy of Sciences (OEIP-O-202002).
Author information
Contributions
Jiaguang Li: Methodology, Writing, and Experimental Design. Yubo Wang: Programming, Experimental Implementation. Zihan Gao: Programming, Experimental Implementation. Ying Wei: Investigation, Supervision, Validation.
Ethics declarations
Conflict of Interest Statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical and Informed Consent for Data Used
The authors declare that the data used in this work are ethical and that they provided informed consent.
About this article
Cite this article
Li, J., Wang, Y., Gao, Z. et al. EFTNet: an efficient fine-tuning method for few-shot segmentation. Appl Intell 54, 9488–9507 (2024). https://doi.org/10.1007/s10489-024-05582-z
DOI: https://doi.org/10.1007/s10489-024-05582-z