All gateway endpoints follow the same conventions.
Base path: /api/v1
HTTP verbs follow standard REST semantics: GET for read operations with no side effects, POST for operations that trigger processing or create resources, and DELETE for termination operations such as stopping a pod.
All responses use a consistent envelope:
{
  "status": "success | error | warming_up",
  "data": {},
  "error": null,
  "meta": {
    "requestId": "uuid",
    "timestamp": "ISO8601"
  }
}
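As an illustration, the envelope could be assembled by a small helper like the one below. This is a minimal sketch, not the gateway's actual code: the function name build_envelope and the extra_meta parameter are hypothetical, and generating requestId with uuid4 and timestamp in UTC are assumptions about details the spec leaves open.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# The three status values listed in the envelope definition.
ALLOWED_STATUSES = {"success", "error", "warming_up"}


def build_envelope(
    status: str,
    data: Optional[Dict[str, Any]] = None,
    error: Optional[Any] = None,
    extra_meta: Optional[Dict[str, Any]] = None,  # hypothetical parameter
) -> Dict[str, Any]:
    """Return a response body in the shared envelope shape."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    meta: Dict[str, Any] = {
        # Assumption: a random UUID per response, timestamp in UTC.
        "requestId": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    if extra_meta:
        meta.update(extra_meta)
    return {
        "status": status,
        "data": data if data is not None else {},
        "error": error,
        "meta": meta,
    }
```

Validating the status in one place keeps every endpoint from emitting a value outside the three the envelope allows.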
When a pod is warming up and a request cannot be served immediately, the gateway returns HTTP 202 with status set to warming_up and an estimated wait time in the meta object, rather than holding the connection open indefinitely.
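On the client side, the warming_up response can be handled as a retry hint. The sketch below shows one way to interpret it. The spec does not name the wait-time field, so estimatedWaitSeconds is a hypothetical field name, and the 5-second fallback and the next_action helper are likewise assumptions made for illustration.

```python
from typing import Any, Dict, Tuple

# Hypothetical field name: the spec only says the wait time lives in "meta".
WAIT_FIELD = "estimatedWaitSeconds"
# Assumed fallback when the gateway omits the estimate.
DEFAULT_WAIT_SECONDS = 5.0


def next_action(http_status: int, body: Dict[str, Any]) -> Tuple[str, float]:
    """Decide what a client should do with a gateway response.

    Returns (action, delay_seconds), where action is "retry", "done", or "fail".
    """
    status = body.get("status")
    if http_status == 202 and status == "warming_up":
        meta = body.get("meta") or {}
        wait = meta.get(WAIT_FIELD, DEFAULT_WAIT_SECONDS)
        return ("retry", float(wait))
    if status == "success":
        return ("done", 0.0)
    # Anything else, including status "error", is treated as a failure.
    return ("fail", 0.0)
```

A caller would sleep for the returned delay and resend the request while the action is "retry", ideally with an upper bound on the number of attempts.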
Endpoint groups:
/api/v1/fantasy/*       Fantasy mode pipeline
/api/v1/chat/*          Ollama-compatible chat (used by OpenWebUI)
/api/v1/code/*          Coding assistant
/api/v1/gpu/*           Pod lifecycle management
/api/v1/health          System health (no auth required)
/api/v1/cost/*          Usage and cost estimates
/api/v1/bot/telegram    Telegram webhook receiver
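The endpoint groups above can be expressed as a small lookup table, for example to drive routing or an auth check. This is a sketch under stated assumptions: the classify function and the ROUTES table are hypothetical, a wildcard group is taken to match its bare prefix as well as any sub-path, and treating every group other than health as requiring auth is an assumption, since the spec only states that health needs none.

```python
from typing import Optional, Tuple

BASE_PATH = "/api/v1"

# (path under the base path, accepts sub-paths, description)
ROUTES = [
    ("fantasy", True, "Fantasy mode pipeline"),
    ("chat", True, "Ollama-compatible chat"),
    ("code", True, "Coding assistant"),
    ("gpu", True, "Pod lifecycle management"),
    ("health", False, "System health"),
    ("cost", True, "Usage and cost estimates"),
    ("bot/telegram", False, "Telegram webhook receiver"),
]

# Assumption: only health is exempt from auth; the spec is silent on the rest.
AUTH_EXEMPT = {"health"}


def classify(path: str) -> Optional[Tuple[str, bool]]:
    """Return (group, requires_auth) for a request path, or None if unknown."""
    prefix = BASE_PATH + "/"
    if not path.startswith(prefix):
        return None
    rest = path[len(prefix):].rstrip("/")
    for name, wildcard, _description in ROUTES:
        if rest == name or (wildcard and rest.startswith(name + "/")):
            return (name, name not in AUTH_EXEMPT)
    return None
```

For instance, classify("/api/v1/gpu/pods/42") yields the gpu group with auth required, while a path outside the base path yields None.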