External testing, stress tests, and the version-by-version fixes that shaped the system


Testing Approach

Email Brain was tested through two channels: internal daily use (running scheduled tasks against a real inbox with 18 active contacts) and external first-user testing (a colleague installing from scratch and running the system independently, including intentional stress tests).

The external testing was more valuable. Setup gaps were invisible to me because I already knew how the system worked; first-user testing revealed what a new user actually encounters.


Version Timeline

v0.1.0 — March 3, 2026 (Initial release)

Five modes operational: Draft, Inbox Scan, Daily Briefing, Decision Extraction, Resource Scanner. Notion context system connected. Gmail integration working. Pre-draft email filtering and basic context retrieval in place.

v0.2.0 — March 6, 2026 (Critical bug fix)

Discovered that Notion's semantic search API was silently returning partial results: 10 of 18 contacts, with no error or warning. The other eight active clients were excluded from every scan. Root cause: a platform limitation in Notion's search, not a logic error.

Fix: the Complete Contact Retrieval Protocol, consisting of a mandatory direct database fetch, local filtering, fallback searches by client code, count verification, and deduplication. After the fix, all 18 contacts were consistently retrieved.

This was the most important bug found during the entire project. Silent partial retrieval is a critical failure mode for any AI system that depends on retrieval: nothing breaks loudly, and the system just quietly misses things.
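
As a rough illustration, the five steps of the v0.2.0 protocol could look like the Python sketch below. This is not the project's actual implementation. The names Contact, retrieve_all_contacts, fetch_database, search_by_code, known_client_codes, and expected_count are all hypothetical, and the two callables stand in for whatever Notion calls the system really makes. The sketch also assumes the expected contact count and the list of client codes are known independently of the search results.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Contact:
    contact_id: str   # hypothetical: unique ID of the contact's database row
    client_code: str  # hypothetical: short code used for fallback searches
    active: bool


class IncompleteRetrievalError(RuntimeError):
    """Raised when the retrieved count does not match the expected count."""


def retrieve_all_contacts(
    fetch_database: Callable[[], Iterable[Contact]],
    search_by_code: Callable[[str], Iterable[Contact]],
    known_client_codes: Iterable[str],
    expected_count: int,
) -> List[Contact]:
    # Steps 1-2: fetch the database directly, then filter locally rather
    # than trusting a search endpoint to apply the filter.
    # Keying by contact_id deduplicates as we go (step 5).
    found = {}
    for contact in fetch_database():
        if contact.active:
            found.setdefault(contact.contact_id, contact)

    # Step 3: fallback search for any known client code still missing.
    seen_codes = {c.client_code for c in found.values()}
    for code in known_client_codes:
        if code in seen_codes:
            continue
        for contact in search_by_code(code):
            if contact.active:
                found.setdefault(contact.contact_id, contact)

    # Step 4: count verification turns a silent partial result into a loud one.
    if len(found) != expected_count:
        raise IncompleteRetrievalError(
            f"expected {expected_count} contacts, retrieved {len(found)}"
        )
    return sorted(found.values(), key=lambda c: c.client_code)
```

The count check is the part that addresses the original failure: a retrieval of 10 of 18 contacts would raise an error instead of passing through unnoticed.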

v0.3.0 — March 6, 2026 (First external feedback round)

Brett installed the system from scratch and ran his first morning scan. Findings: