⚙️ Ainfera Routing — v0 Build Spec (methodology v1.2 → flagship)

Build contract bridging Routing Methodology v1.2 → the ainfera-ai/routing flagship (AIN-188). The methodology is the theory; this page is the buildable cut — what v0 ships, the cold-start data it needs, and the v0→v1→v2 phasing.

Spec: Routing Methodology v1.2 · Repo: github.com/ainfera-ai/routing · Locked 2026-05-22

A · Gap — repo today vs. what v0 needs

The flagship repo currently ships a customer-facing demo, not the routing brain.

In repo now	Missing — v0 builds it
README + quickstart (`scripts/ainfera-e2e.sh`)	`q_prior` scoring + seed table
`STRATEGY.md` positioning	Constrained objective (`M_allowed` veto → argmax)
routing-policy editor (PR #49)	§16 outcome capture (write-only)
E2E demo (routed inference + signed audit)	Exact-match cache · provider fallback · drain-proof budgets · deterministic replay

v0 turns the demo shell into a real router.
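
The "`M_allowed` veto → argmax" objective listed in the table can be read as a two-step selection: drop every candidate the policy forbids, then take the highest-scoring survivor. Below is a minimal Python sketch under stated assumptions: the `route` helper, the shape of the `q_prior` seed table (model name to score), and the model names and scores are all hypothetical and are not the v0 seed data.

```python
from typing import Mapping, Optional, Set


def route(q_prior: Mapping[str, float], m_allowed: Set[str]) -> Optional[str]:
    """Veto first, then argmax: return the allowed model with the highest prior score."""
    # Step 1 (veto): keep only candidates the policy allows.
    candidates = {m: s for m, s in q_prior.items() if m in m_allowed}
    if not candidates:
        return None  # nothing survives the veto; the caller decides how to fail
    # Step 2 (argmax): highest score wins. Iterating names in sorted order
    # makes ties resolve the same way on every run.
    return max(sorted(candidates), key=lambda m: candidates[m])


# Hypothetical seed scores, for illustration only.
seed = {"model-a": 0.62, "model-b": 0.81, "model-c": 0.90}
choice = route(seed, {"model-a", "model-b"})  # model-c is vetoed despite the top score
print(choice)  # model-b
```

The veto runs before scoring, so a disallowed model can never win on score alone.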

A2 · Architecture (LOCKED 2026-05-22)

Brain in routing, runtime in api.

ainfera-ai/routing ships the decision core (`q_prior` plus the constrained optimizer) as an importable policy engine, along with the public spec, policy templates, and `routing-policy.schema.json`. This is the open flagship.
ainfera-ai/api (`/v1/inference`) imports that engine, executes dispatch, and writes §16 outcomes to the store. The runtime/gateway stays in api.
Spark Brain-Factory trains + replays offline from the same outcome store (Methodology v1.2 / AIN-188) and hot-swaps learned policy back into routing.

Clean line: routing = brain (decide) · api = body (dispatch + capture) · Spark = gym (train/replay).
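
The brain/body split can be sketched as a pure decision function on the routing side and a thin handler on the api side that dispatches and appends an outcome row. This is an illustration only: `decide`, `handle_inference`, the `OutcomeRow` fields, and the in-memory list standing in for the outcome store are all hypothetical and do not reflect the actual §16 schema or the `/v1/inference` implementation.

```python
import time
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Set


# "routing" side (brain): a pure decision with no I/O.
def decide(q_prior: Dict[str, float], m_allowed: Set[str]) -> str:
    allowed = sorted(m for m in q_prior if m in m_allowed)
    if not allowed:
        raise ValueError("no model passes the M_allowed veto")
    return max(allowed, key=lambda m: q_prior[m])


# "api" side (body): dispatch plus write-only outcome capture.
@dataclass
class OutcomeRow:  # hypothetical fields, not the §16 schema
    request_id: str
    model: str
    ok: bool
    latency_ms: float


def handle_inference(
    request_id: str,
    prompt: str,
    q_prior: Dict[str, float],
    m_allowed: Set[str],
    dispatch: Callable[[str, str], str],  # injected provider call (assumption)
    store: List[dict],                    # stand-in for the outcome store
) -> str:
    model = decide(q_prior, m_allowed)    # the brain decides
    start = time.perf_counter()
    ok = False
    try:
        output = dispatch(model, prompt)  # the body dispatches
        ok = True
        return output
    finally:
        # Capture runs on success and on failure; rows are only appended.
        latency_ms = (time.perf_counter() - start) * 1000.0
        store.append(asdict(OutcomeRow(request_id, model, ok, latency_ms)))


store: List[dict] = []
fake_dispatch = lambda model, prompt: f"{model} says: {prompt}"
reply = handle_inference(
    "req-1", "hello", {"model-a": 0.4, "model-b": 0.7},
    {"model-a", "model-b"}, fake_dispatch, store,
)
```

Because `decide` takes no I/O dependencies, the same function can be imported unchanged by an offline trainer or replayer.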

A2 · Where the brain lives (LOCKED 2026-05-22)

Two repos, clean split:

ainfera-ai/routing (the brain): `q_prior` seed, the constrained optimizer (§D), policy schema and templates, deterministic replay. Shipped as an importable policy engine plus the public spec. Stays the public flagship.
ainfera-ai/api (the runtime): `/v1/inference` imports the routing policy engine, dispatches to the chosen provider, and writes the §16 outcome row. The runtime owns I/O and billing; the brain owns the decision.
Spark Brain-Factory: trains and replays offline (v1+) from the same `routing_outcomes` store, then ships updated policy back to routing.

The decision function is identical in production, replay, and training — one engine, one source of truth.
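
One way to picture "one engine, one source of truth" is a replay check: production records the decision inputs next to the chosen model, and replay re-runs the same function over the log and flags any disagreement. The sketch below assumes a hypothetical `decide` entry point and a JSON-lines log format; neither is taken from the repo.

```python
import json
from typing import Dict, List


def decide(inputs: Dict) -> str:
    """Hypothetical engine entry point: a pure function of its inputs."""
    scores = inputs["q_prior"]
    # Sorting the candidates keeps tie-breaking stable across runs.
    allowed = sorted(m for m in scores if m in inputs["m_allowed"])
    return max(allowed, key=lambda m: scores[m])


def record(inputs: Dict) -> str:
    """Production path: decide, then serialize inputs and decision as one JSON line."""
    return json.dumps({"inputs": inputs, "chosen": decide(inputs)}, sort_keys=True)


def replay(log_lines: List[str]) -> List[int]:
    """Replay path: re-run the same decide() and return indexes of rows that disagree."""
    mismatches = []
    for i, line in enumerate(log_lines):
        row = json.loads(line)
        if decide(row["inputs"]) != row["chosen"]:
            mismatches.append(i)
    return mismatches


# Hypothetical inputs, for illustration only.
log = [
    record({"q_prior": {"model-a": 0.2, "model-b": 0.7}, "m_allowed": ["model-a", "model-b"]}),
    record({"q_prior": {"model-a": 0.2, "model-b": 0.7}, "m_allowed": ["model-a"]}),
]
print(replay(log))  # []
```

An empty mismatch list means the engine reproduced every logged decision; a non-empty one points at rows where the engine or its inputs have drifted.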

B · v0 scope

Ships in v0 (no learning):