10. 🧾 Receipt OCR
Overview
A service that scans photos of receipts (printed or handwritten), extracts line items, merchant, date, and total, then categorizes expenses automatically.
Primary Use Cases
- Small-business owners tracking spend for bookkeeping
- Freelancers gathering receipts for mileage and expense claims
- Personal users automating expense reports
Key Features
- Mobile/web upload of receipt image
- OCR pipeline tailored for small text and tabular layouts
- Line-item parser: item, quantity, unit price, subtotal
- Merchant/date parsing with fuzzy matching
- Auto-categorization via keyword rules and an ML model (e.g., “Travel,” “Meals”)
- Export to CSV or accounting software (QuickBooks, Xero)
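As a rough illustration of the line-item parser, fuzzy merchant matching, and keyword half of auto-categorization listed above, here is a minimal Python sketch using only the standard library. The line layout ("item qty x price"), the merchant list, the keyword table, and all function names are assumptions made for the example, not part of the spec. In the described design, the ML model would complement or replace the keyword lookup.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from difflib import get_close_matches
from typing import Optional

# Hypothetical reference data; a real service would load these from storage.
KNOWN_MERCHANTS = ["Starbucks", "Shell", "Office Depot", "Uber"]
CATEGORY_KEYWORDS = {
    "Meals": ["coffee", "latte", "sandwich", "lunch"],
    "Travel": ["fuel", "taxi", "fare", "parking"],
}

# Assumed OCR line layout: "<item> <qty> x <unit price>", e.g. "Latte 2 x 3.50".
LINE_RE = re.compile(
    r"^(?P<item>.+?)\s+(?P<qty>\d+)\s*[xX]\s*\$?(?P<price>\d+\.\d{2})$"
)


@dataclass
class LineItem:
    item: str
    quantity: int
    unit_price: Decimal
    subtotal: Decimal


def parse_line(text: str) -> Optional[LineItem]:
    """Parse one OCR text line into a LineItem, or None if it does not fit."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    qty = int(m["qty"])
    price = Decimal(m["price"])  # Decimal avoids float rounding on money
    return LineItem(m["item"].strip(), qty, price, qty * price)


def match_merchant(raw: str, cutoff: float = 0.6) -> Optional[str]:
    """Map a noisy OCR merchant string to the closest known merchant."""
    lookup = {name.lower(): name for name in KNOWN_MERCHANTS}
    hits = get_close_matches(raw.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


def categorize(item: str, default: str = "Uncategorized") -> str:
    """Keyword fallback: return the first category whose keyword appears."""
    text = item.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return default
```

For example, `parse_line("Latte 2 x 3.50")` yields a `LineItem` with a subtotal of `Decimal("7.00")`, while a summary line such as `"TOTAL 15.25"` returns `None` and would need separate handling. The `cutoff` value is a tunable guess; a stricter threshold reduces false merchant matches at the cost of more unmatched receipts.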
Tech Stack
- Frontend: React + TypeScript (upload form + preview)
- Backend: FastAPI (Python)
- AI Models:
  - OCR: fine-tuned TrOCR (microsoft/trocr-small-hybrid)
  - Parser: custom BERT-based classifier for categories
- Storage: PostgreSQL for records; S3 for images
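The split between PostgreSQL for records and S3 for images suggests that each database row stores a pointer to the image rather than the image itself. The sketch below shows one way such a record might look in Python. The field names, the key layout `receipts/<user>/<id>.<ext>`, and the helper functions are all hypothetical, and no database or S3 client is involved.

```python
import uuid
from dataclasses import dataclass, asdict
from datetime import date
from decimal import Decimal


@dataclass
class ReceiptRecord:
    """Hypothetical shape of one row in the receipts table."""
    receipt_id: str
    merchant: str
    purchased_on: date
    total: Decimal
    category: str
    image_key: str  # S3 object key; the image bytes stay out of the database


def image_key_for(user_id: str, receipt_id: str, ext: str = "jpg") -> str:
    """Build an assumed S3 key layout: receipts/<user>/<receipt>.<ext>."""
    return f"receipts/{user_id}/{receipt_id}.{ext}"


def new_record(user_id: str, merchant: str, purchased_on: date,
               total: Decimal, category: str) -> ReceiptRecord:
    """Create a record with a fresh ID and the matching image key."""
    receipt_id = uuid.uuid4().hex
    return ReceiptRecord(
        receipt_id=receipt_id,
        merchant=merchant,
        purchased_on=purchased_on,
        total=total,
        category=category,
        image_key=image_key_for(user_id, receipt_id),
    )
```

Calling `asdict(record)` gives a plain dictionary that could be handed to whichever database layer the backend uses. Keeping only the key in the row keeps the table small and lets image storage be managed independently.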
Architecture