Lakshyajeet Domyan, Taylor Le, Apoorav Rathore
Logging meals is a pain. Studies consistently show that people under-report their daily calories by 20-50% when they track food intake manually. A margin of error that large can derail health goals, fitness plans, and dietary monitoring for millions of people. For our Data Science Lab Final Project, we set out to solve this problem by building an end-to-end food recognition and calorie estimation system that works in near real time. The goal was ambitious but straightforward: snap a photo of your food and get an accurate calorie estimate before your coffee cools down.
To create a successful system, we established two critical performance targets:
These benchmarks would ensure the system was accurate.
The foundation of this project relied on two key data sources:
The Food-101 dataset provided the visual training corpus: 101,000 images spread evenly across 101 food categories, or 1,000 images per category. The images vary widely in quality and composition, ranging from 512-pixel DSLR photographs to 135-pixel smartphone snapshots. This diversity proved invaluable for building a robust model that can handle real-world image variability.
For nutritional information, we used a publicly available Kaggle dataset titled “Calories in Food Items (per 100 grams)” by contributor kkhandekar. From this dataset, we extracted and adapted a subset of 108 standardized food items, each with its caloric content per 100 g, and mapped them to the categories in our Food-101 classification task.
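To make the per-100 g mapping concrete, here is a minimal sketch of how a predicted food label could be turned into a calorie estimate. It is not our actual lookup code: the table name KCAL_PER_100G, the function estimate_calories, and the portion_grams argument are hypothetical, and the calorie values are illustrative placeholders rather than rows copied from the Kaggle dataset. How the portion weight is obtained is also an assumption here; the sketch simply takes it as an input.

```python
# Hypothetical lookup table keyed by Food-101 class name.
# The kcal values are illustrative placeholders, NOT taken from the Kaggle dataset.
KCAL_PER_100G = {
    "pizza": 266.0,
    "apple_pie": 237.0,
}


def estimate_calories(label: str, portion_grams: float) -> float:
    """Scale a per-100 g calorie density to the given portion weight."""
    if portion_grams < 0:
        raise ValueError("portion_grams must be non-negative")
    if label not in KCAL_PER_100G:
        raise KeyError(f"no calorie entry for label {label!r}")
    # kcal = (kcal per 100 g) * (grams / 100)
    return KCAL_PER_100G[label] * portion_grams / 100.0
```

With these placeholder values, estimate_calories("pizza", 150) scales the 100 g figure by 1.5. Raising an error for an unmapped label, rather than returning zero, keeps a missing table entry from silently producing a misleading estimate.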
Raw data rarely arrives in the ideal format for machine learning. To prepare the datasets for model training and inference, we applied several preprocessing steps:
Visual Data Processing:
This careful preparation established the foundation for reliable model training and performance.
For our current implementation, we use a ResNet-18 architecture pre-trained on ImageNet and fine-tuned on the Food-101 dataset.