This document describes the technical implementation of the Legacy BMS PDF Import Pipeline - an automated system that receives policy premium PDFs via email, extracts structured data using a self-hosted AI model, validates the data against the legacy Business Management System (BMS), and stores the document and metadata directly into the Access/JET Engine database.
All infrastructure is self-hosted and on-premises - no data leaves the client's network.
| Component | Technology | Purpose |
|---|---|---|
| Workflow engine | n8n (self-hosted) | Orchestrates the full pipeline from email trigger to BMS storage |
| AI model | Ollama (self-hosted) | Processes and extracts structured data from PDFs on-prem |
| API bridge | Python / FastAPI | Exposes REST endpoints that connect to the Access/JET Engine database |
| Legacy BMS | Microsoft Access (JET Engine) | Core business management system - the target data store |
| Tunnel | ngrok | Exposes the local FastAPI bridge to n8n during development/demo |
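
As a rough illustration of the API bridge row above, the sketch below builds the kind of ODBC connection string a Python service would typically use to reach an Access/JET database file. The driver name is the standard Microsoft Access ODBC driver on Windows. The helper name `access_connection_string` and the file path are hypothetical, and the source does not state which database library the bridge actually uses.

```python
from pathlib import PureWindowsPath

# Standard name of the Microsoft Access ODBC driver on Windows.
ACCESS_DRIVER = "Microsoft Access Driver (*.mdb, *.accdb)"


def access_connection_string(db_path: str) -> str:
    """Build an ODBC connection string for an Access database file.

    Hypothetical helper: the real bridge may configure this differently.
    """
    path = PureWindowsPath(db_path)
    if path.suffix.lower() not in {".mdb", ".accdb"}:
        raise ValueError(f"not an Access database file: {db_path}")
    # Doubled braces emit the literal { } that ODBC expects around the driver name.
    return f"DRIVER={{{ACCESS_DRIVER}}};DBQ={path};"


# Example with a made-up path; the result would be handed to an ODBC library.
print(access_connection_string(r"C:\bms\legacy.mdb"))
```

Keeping the string construction in one small function makes it straightforward to point the bridge at a copy of the database during development or a demo.
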
Workflow name: Health Cheque Demo
Workflow ID: CSzK2rozmbjqooll
Execution order: v1
Binary mode: separate
→ 2. Extract PDF Attachment → 3. Extract PDF Data → 4. Lookup Client → 5. Store Document

| Setting | Value |
|---|---|
| Type | n8n-nodes-base.gmailTrigger v1.3 |
| Poll interval | Every minute |
| Download attachments | Yes |
| Simple mode | Off (returns full message data) |
Polls the connected Gmail account every minute for new emails. Attachments are downloaded as binary data and passed to the next node.
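
To make the hand-off to the next step more concrete, here is a minimal Python sketch of picking the PDF attachments out of a message and keeping their raw bytes. It uses the standard library `email` package rather than n8n, so it illustrates the idea and not the node's internals. `pdf_attachments` is a hypothetical helper name, and the sample message is made up.

```python
from email.message import EmailMessage


def pdf_attachments(message: EmailMessage) -> dict[str, bytes]:
    """Return {filename: raw bytes} for every PDF attached to the message."""
    found = {}
    for part in message.iter_attachments():
        if part.get_content_type() == "application/pdf":
            # Fall back to a generated name if the sender omitted one.
            name = part.get_filename() or f"attachment-{len(found) + 1}.pdf"
            found[name] = part.get_payload(decode=True)
    return found


# Build a small sample message with one PDF and one text attachment.
msg = EmailMessage()
msg["Subject"] = "Policy premium"
msg.set_content("See attached.")
msg.add_attachment(b"%PDF-1.4 demo", maintype="application",
                   subtype="pdf", filename="premium.pdf")
msg.add_attachment("notes", filename="notes.txt")

# Only the PDF is kept; the text attachment is ignored.
print(list(pdf_attachments(msg)))
```

Filtering on the MIME type rather than the file extension avoids passing mislabelled non-PDF files to the extraction step.
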