This repository implements the DPF-Nutrition model, based on the paper "DPF-Nutrition: Food Nutrition Estimation via Depth Prediction and Fusion", with minor modifications. The model was trained and fine-tuned on overhead images from Google’s Nutrition5k dataset and estimates nutritional information from images of food items.
To run inference, please refer to:
- inference.py: Script-based inference
- inference.ipynb: Notebook-based inference
For optimal performance, input images should meet the following requirements:
- Format: RGB
- Resolution: 640 x 480 pixels
- Pixel Range: 0 to 255 (an image does not need to span the full range, but no value should fall outside it)
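The requirements above can be enforced before preprocessing. The following is a minimal sketch using Pillow, not part of this repository: `prepare_image` and `EXPECTED_SIZE` are hypothetical names, and reading "640 x 480" as width x height is an assumption. Note that a plain resize distorts images whose aspect ratio is not 4:3.

```python
from PIL import Image

# Assumption: "640 x 480" is read as (width, height).
EXPECTED_SIZE = (640, 480)


def prepare_image(image: Image.Image) -> Image.Image:
    """Return an 8-bit RGB copy of `image` sized to EXPECTED_SIZE."""
    # convert("RGB") yields three 8-bit channels, so values lie in 0..255.
    rgb = image.convert("RGB")
    if rgb.size != EXPECTED_SIZE:
        # Plain resize; this does not preserve the aspect ratio.
        rgb = rgb.resize(EXPECTED_SIZE)
    return rgb


# Stand-in input: an 800 x 600 grayscale image instead of a real photo.
gray = Image.new("L", (800, 600), color=128)
prepared = prepare_image(gray)
print(prepared.mode, prepared.size)  # RGB (640, 480)
```

With a real photo, `Image.open(path)` would replace the stand-in image.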
Preprocess your images with the following transformation, which resizes them to 384 x 384 and converts them to tensors:
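from torchvision import transforms

# Preprocessing pipeline applied to each input image
image_transform = transforms.Compose([
    transforms.Resize((384, 384)),           # resize to 384 x 384
    transforms.ToTensor(),                   # PIL image -> float tensor (C, H, W), scaled to [0, 1]
    transforms.Lambda(lambda x: x / 255.0),  # additional division by 255
])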
Download the pre-trained model weights from the links below:
- DPT Model Weights (dpt.py): Google Drive Link
- DPF-Nutrition Model Weights (dpfnutrition.py): Google Drive Link