Table of contents


📌 Main()


🗂 Logs (Daily / Weekly Notes)

Detailed daily logs and documentation:

📓 Full Logs & Documentation

Quick logs:

| Date | Summary | Notes | Tags |
|---|---|---|---|
| Sep 13, 2025 | Research, setup, data exploration | Data exploration and setup (Full Log) | data, research |
| Sep 13, 2025 | Data labeler / navigation tool | Tool created to explore and label data efficiently (Full Log) | data, EDA |
| Sep 19, 2025 | Data review & heuristics ideation | Reviewing data and designing heuristics (Full Log) | data, heuristics |
| Sep 20, 2025 | Training data labeling; Dataset Version 1 (Training) | Completed labeling of training dataset; versioned dataset using dvc under data-v1 (Full Log) | data, train, dvc |
| Sep 23, 2025 | Testing data labeling | Completed labeling of test dataset | data, test |
| Sep 24, 2025 | Dataset Version 2 (Testing) | Versioned dataset using dvc under data-v2 (Full Log) | data, dvc |
| Sep 26, 2025 | Data documentation and notes | Notes and documentation on dataset & heuristics (Data Notes; Heuristics Notes) | documentation |
| Sep 28, 2025 | Uploaded data to HuggingFace and Kaggle | Dataset shared on HuggingFace and Kaggle (GitHub; Kaggle; HuggingFace) | data |
| Oct 1, 2025 | Preprocessing for finetuning | Training data preprocessed and prepared for finetuning (Full Log) | data, finetune |
| Oct 5, 2025 | Attempts to finetune base model and QA model | Initial results from finetuning LayoutLM models using LoRA (Full Log) | finetune, LoRA |
| Oct 14, 2025 | Finetune base model w/ LoRA; postprocessing | Baseline model: 81.42% accuracy after postprocessing, without human-in-the-loop (Full Log; HuggingFace model) | finetune, LoRA |
| Nov 26, 2025 | FastAPI + Gradio deployment | Model deployment using FastAPI with a Gradio frontend (Full Log) | model, frontend, deployment |
| Nov 29, 2025 | Heuristics logic, containerization, and logging | Added heuristic logic to the web app and containerized it with logging (Full Log) | containerization, deployment |
| Nov 30, 2025 | Testing, CI/CD, documentation, initial release | Documentation, tests, and CI/CD added. MVP released as Release v0.1.0 (Full Log) | testing, CI/CD, documentation, release |
| Dec 9, 2025 | Evaluations and Dataset Version 2.1 | Model evaluation framework. Versioned dataset under data-v2.1 (Full Log; HuggingFace; Kaggle) | data, evaluations |
| Dec 10, 2025 | Experiment tracking via Weights and Biases; code refactor and documentation | Implemented experiment tracking via W&B. Prepared repo for new features by refactoring app.py and the data labeling tool. Added documentation. (Full Log) | experiments, evaluations, refactor, documentation |
| Dec 23, 2025 to Jan 4, 2026 | ONNX and Gemini model evaluations | Exported and quantized models to use ONNX Runtime for serving. Benchmarked Gemini and ONNX models. Release v0.2.3. (Full Log) | quantization, experiments, evaluations, documentation |

🏷️ GitHub Tags and Releases


| Date | Name / Link | Description |
|---|---|---|
| Sep 13, 2025 | data-v1 | Labeled data includes train (ambiguous) |
| Sep 24, 2025 | data-v2 | Labeled data includes train + test (ambiguous) |
| Nov 30, 2025 | v0.1.0 | MVP release |
| Dec 9, 2025 | data-v2.1 | Minor fix in test labels. See changes here. |
| Dec 31, 2025 | v0.2.2 | Benchmarking, ONNX for inference. |
| Jan 4, 2026 | v0.2.3 | Minor update to benchmarks: use ONNX as fallback |

📎 Miscellaneous Links / Notes