DOI: 10.1145/3552485.3554939
Short Paper

Few-shot Food Recognition with Pre-trained Model

Published: 10 October 2022

Abstract

Food recognition is a challenging task due to the diversity of food. Moreover, conventionally training food recognition networks demands large amounts of labeled images, which are laborious and expensive to collect. In this work, we tackle the few-shot food recognition problem by leveraging the knowledge learned by pre-trained models such as CLIP. Although CLIP has shown remarkable zero-shot capability on a wide range of vision tasks, it performs poorly on the domain-specific food recognition task. To transfer CLIP's rich prior knowledge, we explore an adapter-based approach that fine-tunes CLIP with only a few samples, effectively combining CLIP's prior knowledge with the new knowledge extracted from the few-shot training set. In addition, we design appropriate prompts to facilitate more accurate identification of foods from different cuisines. Experiments demonstrate that our approach achieves promising performance on two public food datasets, VIREO Food-172 and UECFood-256.
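The abstract only summarizes the approach, and the paper's code is not reproduced on this page. As a rough illustration, the minimal sketch below shows one common way to realize an adapter/cache-style few-shot adaptation of a frozen CLIP model: zero-shot logits from text prompts are blended with logits computed from a small cache of few-shot support features (in the spirit of Tip-Adapter-like methods). The class names, prompt template, and the weights `alpha`/`beta` are illustrative assumptions, not the authors' settings.

```python
# Minimal sketch, NOT the authors' released code: cache/adapter-style few-shot
# adaptation on top of a frozen CLIP model. Class names, the prompt template,
# and alpha/beta are illustrative assumptions.
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.eval()

# Hypothetical food classes and a simple food-oriented prompt template.
class_names = ["mapo tofu", "sushi", "pad thai"]
text_tokens = clip.tokenize(
    [f"a photo of {c}, a kind of food." for c in class_names]
).to(device)

with torch.no_grad():
    text_feats = model.encode_text(text_tokens).float()
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)


def build_cache(support_images, support_labels, num_classes):
    """Encode the few-shot support set into (keys, one-hot values) with frozen CLIP."""
    with torch.no_grad():
        keys = model.encode_image(support_images.to(device)).float()
        keys = keys / keys.norm(dim=-1, keepdim=True)
    values = torch.nn.functional.one_hot(support_labels.to(device), num_classes).float()
    return keys, values


def predict(images, cache_keys, cache_values, alpha=1.0, beta=5.5):
    """Blend CLIP's zero-shot logits (prior knowledge) with cache logits (few-shot knowledge)."""
    with torch.no_grad():
        feats = model.encode_image(images.to(device)).float()
        feats = feats / feats.norm(dim=-1, keepdim=True)
    zero_shot_logits = 100.0 * feats @ text_feats.T   # CLIP text-prompt classifier
    affinity = feats @ cache_keys.T                    # similarity to support images
    cache_logits = torch.exp(-beta * (1.0 - affinity)) @ cache_values
    return zero_shot_logits + alpha * cache_logits     # fused prediction
```

In this training-free variant, `preprocess` would be applied to each support and query image before batching, the cache is built once from the labeled few-shot examples, and no CLIP parameters are updated; an adapter-based variant would instead make the cache keys (or a small bottleneck layer) learnable and fine-tune them on the few-shot set.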

Supplementary Material

MP4 File (Few-shot Food Recognition with Pre-trained Model.mp4)
Presentation video of "Few-shot Food Recognition with Pre-trained Model".


Cited By

  • (2025) Learning complementary visual information for few-shot food recognition by Regional Erasure and Reactivation. Expert Systems with Applications, 268, 126174. https://doi.org/10.1016/j.eswa.2024.126174. Online publication date: Apr-2025.
  • (2023) Ingredient Prediction via Context Learning Network With Class-Adaptive Asymmetric Loss. IEEE Transactions on Image Processing, 32, 5509-5523. https://doi.org/10.1109/TIP.2023.3318958. Online publication date: 1-Jan-2023.


      Published In

      CEA++ '22: Proceedings of the 1st International Workshop on Multimedia for Cooking, Eating, and related APPlications
      October 2022
      66 pages
      ISBN:9781450395038
      DOI:10.1145/3552485

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 10 October 2022


      Author Tags

      1. few-shot learning
      2. food recognition
      3. transfer learning

      Qualifiers

      • Short-paper

      Funding Sources

      • Shanghai Pujiang Program
      • National Natural Science Foundation of China Project

      Conference

      MM '22

      Acceptance Rates

      Overall Acceptance Rate 20 of 33 submissions, 61%


      Article Metrics

      • Downloads (Last 12 months): 42
      • Downloads (Last 6 weeks): 3
      Reflects downloads up to 02 Mar 2025
