Article
DOI: 10.1007/978-3-030-49108-6_38

Single Image-Based Food Volume Estimation Using Monocular Depth-Prediction Networks

Published: 19 July 2020

Abstract

In this work, we present a system that estimates food volume from a single input image by exploiting recent advances in monocular depth estimation. We employ a state-of-the-art monocular depth-prediction network, trained exclusively on video data drawn from the publicly available EPIC-KITCHENS dataset and our own collection of food videos. Alongside it, an instance segmentation network trained on the UNIMIB2016 food-image dataset detects and produces segmentation masks for each food item depicted in the given image. Combining the predicted depth map, the segmentation masks and the known camera intrinsic parameters, we generate three-dimensional (3D) point cloud representations of the target food objects and approximate their volumes with our point cloud-to-volume algorithm. We evaluate our system on a test set consisting of images portraying various foods with measured ground-truth volumes, including images that contain several foods at once.
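The pipeline the abstract describes (depth map + segmentation mask + camera intrinsics → metric point cloud → volume) can be sketched in a few lines. This is not the paper's own point cloud-to-volume algorithm, only a minimal illustration of the underlying geometry, assuming a roughly top-down view and a known depth for the supporting plane; `table_depth`, `fx`, `fy`, `cx`, `cy` are hypothetical parameter names:

```python
import numpy as np

def depth_to_point_cloud(depth, mask, fx, fy, cx, cy):
    """Back-project the masked depth pixels into 3D camera coordinates
    via the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    v, u = np.nonzero(mask)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # (N, 3) metric points

def estimate_food_volume(depth, mask, fx, fy, table_depth):
    """Approximate the food volume by integrating, for each masked pixel,
    the column between the food surface and the supporting plane,
    assuming the camera looks roughly straight down at the plate."""
    z = depth[mask]
    height = np.clip(table_depth - z, 0.0, None)  # food rises toward the camera
    pixel_area = (z / fx) * (z / fy)              # metric footprint of one pixel at depth z
    return float(np.sum(pixel_area * height))     # cubic metres
```

The key geometric fact is that a pixel's metric footprint grows with depth (z/fx by z/fy metres), so summing footprint times column height numerically integrates the volume between the food surface and the plate plane.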

References

[1]
U.S. Department of Agriculture, A.R.S.: FoodData central (2019). https://fdc.nal.usda.gov/
[2]
Almaghrabi, R., Villalobos, G., Pouladzadeh, P., Shirmohammadi, S.: A novel method for measuring nutrition intake based on food image. In: 2012 IEEE International Instrumentation and Measurement Technology Conference Proceedings, pp. 366–370. IEEE (2012)
[3]
Bossard, L., Guillaumin, M., Van Gool, L.: Food-101 – mining discriminative components with random forests. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) Computer Vision – ECCV 2014, pp. 446–461. Springer, Cham (2014)
[4]
Chen, M., Dhingra, K., Wu, W., Yang, L., Sukthankar, R., Yang, J.: PFID: Pittsburgh fast-food image dataset. In: 2009 16th IEEE International Conference on Image Processing (ICIP), pp. 289–292. IEEE (2009)
[5]
Ciocca, G., Napoletano, P., Schettini, R.: Food recognition: a new dataset, experiments, and results. IEEE J. Biomed. Health Inform. 21(3), 588–598 (2016)
[6]
Cordeiro, F., Bales, E., Cherry, E., Fogarty, J.: Rethinking the mobile food journal: exploring opportunities for lightweight photo-based capture. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 3207–3216 (2015)
[7]
Cordeiro, F., et al.: Barriers and negative nudges: Exploring challenges in food journaling. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 1159–1162 (2015)
[8]
Damen, D., et al.: Scaling egocentric vision: the EPIC-KITCHENS dataset. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018, pp. 753–771. Springer, Cham (2018)
[9]
Dehais, J., Anthimopoulos, M., Shevchik, S., Mougiakakou, S.: Two-view 3D reconstruction for food volume estimation. IEEE Trans. Multimedia 19(5), 1090–1099 (2016)
[10]
Edelsbrunner, H., Harer, J.: Computational Topology: An Introduction. American Mathematical Society, Providence (2010)
[11]
Ege, T., Yanai, K.: Image-based food calorie estimation using recipe information. IEICE Trans. Inf. Syst. 101(5), 1333–1341 (2018)
[12]
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
[13]
Godard, C., Mac Aodha, O., Brostow, G.J.: Unsupervised monocular depth estimation with left-right consistency. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 270–279 (2017)
[14]
Godard, C., Mac Aodha, O., Firman, M., Brostow, G.J.: Digging into self-supervised monocular depth estimation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3828–3838 (2019)
[15]
Hassannejad, H., Matrella, G., Ciampolini, P., De Munari, I., Mordonini, M., Cagnoni, S.: Food image recognition using very deep convolutional networks. In: Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management, pp. 41–49 (2016)
[16]
Hassannejad, H., Matrella, G., Ciampolini, P., De Munari, I., Mordonini, M., Cagnoni, S.: A new approach to image-based estimation of food volume. Algorithms 10(2), 66 (2017)
[17]
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
[18]
International Food Information Council (IFIC) Foundation: 2019 Food and Health Survey (2019). https://foodinsight.org/wp-content/uploads/2019/05/IFIC-Foundation-2019-Food-and-Health-Report-FINAL.pdf
[19]
Liang, Y., Li, J.: Deep learning-based food calorie estimation method in dietary assessment. arXiv preprint arXiv:1706.04062 (2017)
[20]
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) Computer Vision – ECCV 2014, pp. 740–755. Springer, Cham (2014)
[21]
Martinel, N., Foresti, G.L., Micheloni, C.: Wide-slice residual networks for food recognition. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 567–576. IEEE (2018)
[22]
Myers, A., et al.: Im2calories: towards an automated mobile vision food diary. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1233–1241 (2015)
[23]
Schoeller, D.A., Bandini, L.G., Dietz, W.H.: Inaccuracies in self-reported intake identified by comparison with the doubly labelled water method. Can. J. Physiol. Pharmacol. 68(7), 941–949 (1990)
[24]
Charrondiere, U.R., Haytowitz, D., Stadlmayr, B.: FAO/INFOODS databases, density database version 2.0 (2012). http://www.fao.org/3/ap815e/ap815e.pdf
[25]
Xie, J., Girshick, R., Farhadi, A.: Deep3D: fully automatic 2D-to-3D video conversion with deep convolutional neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV 2016, pp. 842–857. Springer, Cham (2016)
[26]
Xu, C., He, Y., Khannan, N., Parra, A., Boushey, C., Delp, E.: Image-based food volume estimation. In: Proceedings of the 5th International Workshop on Multimedia for Cooking & Eating Activities, pp. 75–80 (2013)
[27]
Zhou, T., Brown, M., Snavely, N., Lowe, D.G.: Unsupervised learning of depth and ego-motion from video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1851–1858 (2017)

Cited By

  • (2023) Survey on food intake methods using visual technologies. Proceedings of the 8th International Workshop on Sensor-Based Activity Recognition and Artificial Intelligence, pp. 1–11. DOI: 10.1145/3615834.3615839. Published: 21 Sep 2023
  • (2022) DepthGrillCam. Proceedings of the 7th International Workshop on Multimedia Assisted Dietary Management, pp. 55–59. DOI: 10.1145/3552484.3555752. Published: 10 Oct 2022
  • (2021) Investigating Preferred Food Description Practices in Digital Food Journaling. Proceedings of the 2021 ACM Designing Interactive Systems Conference, pp. 589–605. DOI: 10.1145/3461778.3462145. Published: 28 Jun 2021


Published In

Universal Access in Human-Computer Interaction. Applications and Practice: 14th International Conference, UAHCI 2020, Held as Part of the 22nd HCI International Conference, HCII 2020, Copenhagen, Denmark, July 19–24, 2020, Proceedings, Part II
Jul 2020
646 pages
ISBN:978-3-030-49107-9
DOI:10.1007/978-3-030-49108-6
  • Editors: Margherita Antona, Constantine Stephanidis

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Food volume estimation
  2. Monocular depth estimation
  3. Food image processing
  4. Deep learning
