Article
DOI: 10.1007/978-3-030-49108-6_38

Single Image-Based Food Volume Estimation Using Monocular Depth-Prediction Networks

Published: 19 July 2020

Abstract

In this work, we present a system that estimates food volume from a single input image by exploiting recent advances in monocular depth estimation. We employ a state-of-the-art monocular depth-prediction network, trained exclusively on video data drawn from the publicly available EPIC-KITCHENS dataset and our own collection of food videos. Alongside it, an instance segmentation network trained on the UNIMIB2016 food-image dataset detects and produces segmentation masks for each food item depicted in the given image. Combining the predicted depth map, the segmentation masks and the known camera intrinsic parameters, we generate three-dimensional (3D) point cloud representations of the target food objects and approximate their volumes with our point cloud-to-volume algorithm. We evaluate our system on a test set consisting of images portraying various foods with measured ground-truth volumes, including images that contain several foods at once.
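The pipeline the abstract describes (depth map + segmentation mask + camera intrinsics → metric point cloud → volume) can be sketched in a few lines. This is not the paper's own point cloud-to-volume algorithm, only a minimal illustration of the underlying geometry, assuming a roughly top-down view and a known depth for the supporting plane; `table_depth`, `fx`, `fy`, `cx`, `cy` are hypothetical parameter names:

```python
import numpy as np

def depth_to_point_cloud(depth, mask, fx, fy, cx, cy):
    """Back-project the masked depth pixels into 3D camera coordinates
    via the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    v, u = np.nonzero(mask)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # (N, 3) metric points

def estimate_food_volume(depth, mask, fx, fy, table_depth):
    """Approximate the food volume by integrating, for each masked pixel,
    the column between the food surface and the supporting plane,
    assuming the camera looks roughly straight down at the plate."""
    z = depth[mask]
    height = np.clip(table_depth - z, 0.0, None)  # food rises toward the camera
    pixel_area = (z / fx) * (z / fy)              # metric footprint of one pixel at depth z
    return float(np.sum(pixel_area * height))     # cubic metres
```

The key geometric fact is that a pixel's metric footprint grows with depth (z/fx by z/fy metres), so summing footprint times column height numerically integrates the volume between the food surface and the plate plane.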

References

[1]
U.S. Department of Agriculture, A.R.S.: FoodData central (2019). https://fdc.nal.usda.gov/
[2]
Almaghrabi, R., Villalobos, G., Pouladzadeh, P., Shirmohammadi, S.: A novel method for measuring nutrition intake based on food image. In: 2012 IEEE International Instrumentation and Measurement Technology Conference Proceedings, pp. 366–370. IEEE (2012)
[3]
Bossard, L., Guillaumin, M., Van Gool, L.: Food-101 – mining discriminative components with random forests. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) Computer Vision – ECCV 2014, pp. 446–461. Springer, Cham (2014)
[4]
Chen, M., Dhingra, K., Wu, W., Yang, L., Sukthankar, R., Yang, J.: PFID: Pittsburgh fast-food image dataset. In: 2009 16th IEEE International Conference on Image Processing (ICIP), pp. 289–292. IEEE (2009)
[5]
Ciocca, G., Napoletano, P., Schettini, R.: Food recognition: a new dataset, experiments, and results. IEEE J. Biomed. Health Inform. 21(3), 588–598 (2016)
[6]
Cordeiro, F., Bales, E., Cherry, E., Fogarty, J.: Rethinking the mobile food journal: exploring opportunities for lightweight photo-based capture. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 3207–3216 (2015)
[7]
Cordeiro, F., et al.: Barriers and negative nudges: Exploring challenges in food journaling. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 1159–1162 (2015)
[8]
Damen, D., et al.: Scaling egocentric vision: the EPIC-KITCHENS dataset. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018, pp. 753–771. Springer, Cham (2018)
[9]
Dehais, J., Anthimopoulos, M., Shevchik, S., Mougiakakou, S.: Two-view 3D reconstruction for food volume estimation. IEEE Trans. Multimedia 19(5), 1090–1099 (2016)
[10]
Edelsbrunner, H., Harer, J.: Computational Topology: An Introduction. American Mathematical Society, Providence (2010)
[11]
Ege, T., Yanai, K.: Image-based food calorie estimation using recipe information. IEICE Trans. Inf. Syst. 101(5), 1333–1341 (2018)
[12]
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
[13]
Godard, C., Mac Aodha, O., Brostow, G.J.: Unsupervised monocular depth estimation with left-right consistency. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 270–279 (2017)
[14]
Godard, C., Mac Aodha, O., Firman, M., Brostow, G.J.: Digging into self-supervised monocular depth estimation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3828–3838 (2019)
[15]
Hassannejad, H., Matrella, G., Ciampolini, P., De Munari, I., Mordonini, M., Cagnoni, S.: Food image recognition using very deep convolutional networks. In: Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management, pp. 41–49 (2016)
[16]
Hassannejad, H., Matrella, G., Ciampolini, P., De Munari, I., Mordonini, M., Cagnoni, S.: A new approach to image-based estimation of food volume. Algorithms 10(2), 66 (2017)
[17]
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
[18]
International Food Information Council (IFIC) Foundation: 2019 Food and Health Survey (2019). https://foodinsight.org/wp-content/uploads/2019/05/IFIC-Foundation-2019-Food-and-Health-Report-FINAL.pdf
[19]
Liang, Y., Li, J.: Deep learning-based food calorie estimation method in dietary assessment. arXiv preprint arXiv:1706.04062 (2017)
[20]
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) Computer Vision – ECCV 2014, pp. 740–755. Springer, Cham (2014)
[21]
Martinel, N., Foresti, G.L., Micheloni, C.: Wide-slice residual networks for food recognition. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 567–576. IEEE (2018)
[22]
Myers, A., et al.: Im2calories: towards an automated mobile vision food diary. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1233–1241 (2015)
[23]
Schoeller, D.A., Bandini, L.G., Dietz, W.H.: Inaccuracies in self-reported intake identified by comparison with the doubly labelled water method. Can. J. Physiol. Pharmacol. 68(7), 941–949 (1990)
[24]
Charrondiere, U.R., Haytowitz, D., Stadlmayr, B.: FAO/INFOODS databases, density database version 2.0 (2012). http://www.fao.org/3/ap815e/ap815e.pdf
[25]
Xie, J., Girshick, R., Farhadi, A.: Deep3D: fully automatic 2D-to-3D video conversion with deep convolutional neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV 2016, pp. 842–857. Springer, Cham (2016)
[26]
Xu, C., He, Y., Khannan, N., Parra, A., Boushey, C., Delp, E.: Image-based food volume estimation. In: Proceedings of the 5th International Workshop on Multimedia for Cooking & Eating Activities, pp. 75–80 (2013)
[27]
Zhou, T., Brown, M., Snavely, N., Lowe, D.G.: Unsupervised learning of depth and ego-motion from video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1851–1858 (2017)

Cited By

  • (2023) Survey on food intake methods using visual technologies. Proceedings of the 8th International Workshop on Sensor-Based Activity Recognition and Artificial Intelligence, pp. 1–11. DOI: 10.1145/3615834.3615839. Published: 21 Sep 2023
  • (2022) DepthGrillCam. Proceedings of the 7th International Workshop on Multimedia Assisted Dietary Management, pp. 55–59. DOI: 10.1145/3552484.3555752. Published: 10 Oct 2022
  • (2021) Investigating Preferred Food Description Practices in Digital Food Journaling. Proceedings of the 2021 ACM Designing Interactive Systems Conference, pp. 589–605. DOI: 10.1145/3461778.3462145. Published: 28 Jun 2021


Published In

Universal Access in Human-Computer Interaction. Applications and Practice: 14th International Conference, UAHCI 2020, Held as Part of the 22nd HCI International Conference, HCII 2020, Copenhagen, Denmark, July 19–24, 2020, Proceedings, Part II
Jul 2020
646 pages
ISBN:978-3-030-49107-9
DOI:10.1007/978-3-030-49108-6
  • Editors: Margherita Antona, Constantine Stephanidis

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Food volume estimation
  2. Monocular depth estimation
  3. Food image processing
  4. Deep learning
