A client sends a request with a valid Bearer token. The Auth Filter validates the JWT signature and checks that the token has not expired. The user's identity is extracted from the token claims and attached to the request context before the request proceeds to the target service.
```mermaid
sequenceDiagram
autonumber
participant Client
participant AuthFilter as Auth Filter
participant TargetService as Target Service
Client->>AuthFilter: POST /api/v1/chat {Authorization: Bearer token, body}
AuthFilter->>AuthFilter: extract token from Authorization header
AuthFilter->>AuthFilter: validate JWT signature against secret
AuthFilter->>AuthFilter: validate token expiry
AuthFilter->>AuthFilter: extract user identity from claims
AuthFilter->>TargetService: forward request with user context attached
TargetService-->>Client: 200 response
```
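A minimal sketch of the happy path as Express middleware, assuming the jsonwebtoken library with an HS256 shared secret; the shape of the attached user context is illustrative, and the 401 branches are sketched after the next diagram.

```typescript
import { Request, Response, NextFunction } from "express";
import jwt, { JwtPayload } from "jsonwebtoken";

// Shared secret for HS256 verification; assumed to come from configuration.
const JWT_SECRET = process.env.JWT_SECRET ?? "";

export function authFilter(req: Request, res: Response, next: NextFunction): void {
  // Extract the token from the Authorization header.
  const header = req.headers.authorization ?? "";
  const token = header.startsWith("Bearer ") ? header.slice("Bearer ".length) : "";

  // verify() checks the signature and rejects expired tokens in one call.
  const claims = jwt.verify(token, JWT_SECRET) as JwtPayload;

  // Attach the user identity from the claims to the request context.
  (req as Request & { user?: { id?: string } }).user = { id: claims.sub };
  next();
}
```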
A client sends a request with a missing, malformed, or expired token. The Auth Filter rejects it immediately with a 401 response, and no downstream service is invoked.
```mermaid
sequenceDiagram
autonumber
participant Client
participant AuthFilter as Auth Filter
Client->>AuthFilter: POST /api/v1/chat {Authorization: Bearer token (missing, malformed, or expired), body}
AuthFilter->>AuthFilter: extract token from Authorization header
alt Token missing or malformed
AuthFilter-->>Client: 401 {status: error, error: {code: INVALID_TOKEN, message: token is missing or malformed}}
else Token signature invalid
AuthFilter-->>Client: 401 {status: error, error: {code: INVALID_TOKEN, message: token signature is invalid}}
else Token expired
AuthFilter-->>Client: 401 {status: error, error: {code: TOKEN_EXPIRED, message: token has expired}}
end
```
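The three 401 branches map cleanly onto jsonwebtoken's error types: expiry raises TokenExpiredError (a subclass of JsonWebTokenError), while bad signatures raise JsonWebTokenError itself; a missing or malformed header is normally caught by a guard before verify() is even called. A sketch of the rejection helper, with the envelope copied from the diagram and the helper name illustrative:

```typescript
import { Response } from "express";
import { JsonWebTokenError, TokenExpiredError } from "jsonwebtoken";

// Build the 401 envelope shown in the diagram.
function reject(res: Response, code: string, message: string): void {
  res.status(401).json({ status: "error", error: { code, message } });
}

// Map an auth failure onto the diagram's branches. Order matters:
// TokenExpiredError extends JsonWebTokenError, so it is checked first.
export function rejectUnauthorized(res: Response, err: unknown): void {
  if (err instanceof TokenExpiredError) {
    reject(res, "TOKEN_EXPIRED", "token has expired");
  } else if (err instanceof JsonWebTokenError) {
    reject(res, "INVALID_TOKEN", "token signature is invalid");
  } else {
    // Any other error, e.g. a guard that throws before verify()
    // because the header was missing or malformed.
    reject(res, "INVALID_TOKEN", "token is missing or malformed");
  }
}
```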
A user sends a plain English message to the Telegram bot asking about pod or system status. The Telegram Bot Service receives the webhook, sends the message text to the local Ollama instance on the home server for intent classification, and routes the resolved intent to the appropriate internal service. The result is formatted as a natural language response and sent back to the user via the Telegram Bot API.
```mermaid
sequenceDiagram
autonumber
participant Telegram as Telegram Platform
participant BotService as Telegram Bot Service
participant LocalOllama as Local Ollama (Home Server)
participant GpuManager as GPU Lifecycle Manager
participant CostTracker as Cost Tracker
participant TelegramAPI as Telegram Bot API
Telegram->>BotService: POST /api/v1/bot/telegram {secret_token, message: {chat_id, text: is the pod running?}}
BotService->>BotService: validate Telegram secret token header
BotService->>LocalOllama: POST /api/generate {model: phi3:mini, prompt: classify intent from message text}
LocalOllama-->>BotService: {intent: POD_STATUS, params: {}}
alt Intent is POD_STATUS
BotService->>GpuManager: getStatus()
GpuManager-->>BotService: {status: READY, uptime: 23 minutes}
BotService->>CostTracker: getSessionEstimate()
CostTracker-->>BotService: {estimatedCost: $0.13, sessionDuration: 23 minutes}
BotService->>BotService: format natural language response
BotService->>TelegramAPI: POST /sendMessage {chat_id, text: Pod has been running for 23 minutes, estimated cost so far is $0.13}
TelegramAPI-->>BotService: message delivered
else Intent is POD_STOP
BotService->>GpuManager: requestShutdown()
GpuManager-->>BotService: shutdown initiated
BotService->>TelegramAPI: POST /sendMessage {chat_id, text: Shutting down the pod now}
TelegramAPI-->>BotService: message delivered
else Intent is POD_START
BotService->>GpuManager: requestStart()
GpuManager-->>BotService: pod starting
BotService->>TelegramAPI: POST /sendMessage {chat_id, text: Starting the pod, will take about a minute}
TelegramAPI-->>BotService: message delivered
else Intent is COST_SUMMARY
BotService->>CostTracker: getMonthlySummary()
CostTracker-->>BotService: {totalHours: 18, estimatedTotal: $4.90, sessionCount: 7}
BotService->>TelegramAPI: POST /sendMessage {chat_id, text: This month you have run 7 sessions totalling 18 hours, estimated cost is $4.90}
TelegramAPI-->>BotService: message delivered
else Intent unrecognised
BotService->>TelegramAPI: POST /sendMessage {chat_id, text: I did not understand that. You can ask about pod status, cost, or ask me to start or stop the pod}
TelegramAPI-->>BotService: message delivered
end
BotService-->>Telegram: 200 OK
```
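A sketch of the classification step, assuming Ollama's /api/generate endpoint with streaming disabled and JSON-constrained output; the prompt wording and the Intent type are illustrative, with intent names taken from the diagram.

```typescript
// Intents from the diagram, plus UNKNOWN for the unrecognised branch.
interface Intent {
  intent: "POD_STATUS" | "POD_STOP" | "POD_START" | "COST_SUMMARY" | "UNKNOWN";
  params: Record<string, unknown>;
}

const OLLAMA_URL = "http://localhost:11434"; // local Ollama on the home server

async function classifyIntent(text: string): Promise<Intent> {
  const res = await fetch(`${OLLAMA_URL}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "phi3:mini",
      prompt:
        `Classify the user message into one of POD_STATUS, POD_STOP, ` +
        `POD_START, COST_SUMMARY or UNKNOWN. Reply with JSON of the form ` +
        `{"intent": "...", "params": {}}. Message: ${text}`,
      stream: false, // return one JSON object instead of a token stream
      format: "json", // ask Ollama to constrain the output to valid JSON
    }),
  });
  // Ollama wraps the model output in a "response" string field.
  const body = (await res.json()) as { response: string };
  return JSON.parse(body.response) as Intent;
}
```

The resolved intent then drives an ordinary switch over the branches shown in the diagram.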
A user sends a message triggering a fantasy mode generation through the Telegram bot. Because image and story generation can take 30 seconds or more, the bot sends an immediate acknowledgement so the webhook returns within Telegram's timeout, then processes the request asynchronously and sends a follow-up message with the result when complete.
```mermaid
sequenceDiagram
autonumber
participant Telegram as Telegram Platform
participant BotService as Telegram Bot Service
participant LocalOllama as Local Ollama (Home Server)
participant TelegramAPI as Telegram Bot API
participant Orchestrator as Fantasy Orchestrator
participant GpuManager as GPU Lifecycle Manager
Telegram->>BotService: POST /api/v1/bot/telegram {secret_token, message: {chat_id, text: generate a fantasy story from this image, image attached}}
BotService->>BotService: validate Telegram secret token
BotService->>LocalOllama: POST /api/generate {model: phi3:mini, prompt: classify intent}
LocalOllama-->>BotService: {intent: FANTASY_GENERATE, params: {has_image: true}}
BotService->>TelegramAPI: POST /sendMessage {chat_id, text: On it! Generating your fantasy story and illustration, this may take up to a minute}
TelegramAPI-->>BotService: acknowledgement sent
BotService-->>Telegram: 200 OK
Note over BotService, Orchestrator: Processing continues asynchronously after webhook response
BotService->>GpuManager: getStatus()
alt Pod is STOPPED
GpuManager-->>BotService: STOPPED
BotService->>GpuManager: requestStart()
Note over BotService: waits for pod to reach READY before continuing
GpuManager-->>BotService: READY
else Pod is READY
GpuManager-->>BotService: READY
end
BotService->>Orchestrator: generateFantasy(image_bytes, prompt)
Orchestrator-->>BotService: {story: text, image_base64: encoded image}
BotService->>TelegramAPI: POST /sendPhoto {chat_id, photo: image, caption: story text}
TelegramAPI-->>BotService: result delivered to user
```
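A sketch of the ack-then-process pattern, assuming an Express webhook handler and the Telegram Bot API's sendMessage method; the handleFantasyRequest helper and route wiring are illustrative.

```typescript
import express, { Request, Response } from "express";

const app = express();
app.use(express.json());

const TELEGRAM_API = `https://api.telegram.org/bot${process.env.BOT_TOKEN}`;

app.post("/api/v1/bot/telegram", async (req: Request, res: Response) => {
  const chatId: number = req.body.message.chat.id;

  // Immediate acknowledgement to the user, before any slow work starts.
  await fetch(`${TELEGRAM_API}/sendMessage`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      chat_id: chatId,
      text: "On it! Generating your fantasy story and illustration, this may take up to a minute",
    }),
  });

  // Return 200 now so Telegram does not time out and retry the webhook.
  res.sendStatus(200);

  // Fire-and-forget: pod start, generation, and the follow-up sendPhoto
  // all happen after the webhook response.
  void handleFantasyRequest(chatId, req.body.message).catch((err) =>
    console.error("fantasy generation failed", err),
  );
});

async function handleFantasyRequest(chatId: number, message: unknown): Promise<void> {
  // 1. ensure the pod is READY (requestStart + poll if STOPPED)
  // 2. call the Fantasy Orchestrator's generateFantasy(...)
  // 3. POST the result to `${TELEGRAM_API}/sendPhoto`
}
```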
A service requests the current connection details for reaching Ollama and ComfyUI on the GPU pod. When RunPod is the configured provider, the adapter constructs the connection URLs from the static pod ID without any API call. Because the pod ID is fixed for the lifetime of the pod, the URLs are always predictable.
```mermaid
sequenceDiagram
autonumber
participant Service as Requesting Service
participant Provider as Provider Port
participant RunPodAdapter as RunPod Adapter
Service->>Provider: getConnectionDetails()
Provider->>RunPodAdapter: getConnectionDetails()
RunPodAdapter->>RunPodAdapter: read pod ID from configuration
RunPodAdapter->>RunPodAdapter: construct Ollama URL as https://{podId}-11434.proxy.runpod.net
RunPodAdapter->>RunPodAdapter: construct ComfyUI URL as https://{podId}-8188.proxy.runpod.net
RunPodAdapter-->>Provider: {ollamaUrl, comfyUrl}
Provider-->>Service: {ollamaUrl, comfyUrl}
```
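A sketch of the RunPod adapter, assuming the proxy URL scheme above; the interface and class names are illustrative. The method is async only to match the port contract, since no network call is needed.

```typescript
interface ConnectionDetails {
  ollamaUrl: string;
  comfyUrl: string;
}

class RunPodAdapter {
  constructor(private readonly podId: string) {}

  // No API call: the pod ID is static configuration, so the URLs are
  // derived locally and never change for the lifetime of the pod.
  async getConnectionDetails(): Promise<ConnectionDetails> {
    return {
      ollamaUrl: `https://${this.podId}-11434.proxy.runpod.net`,
      comfyUrl: `https://${this.podId}-8188.proxy.runpod.net`,
    };
  }
}
```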
When Vast.ai is the configured provider, the adapter must query the Vast.ai API on every lookup to resolve the current public IP and the mapped external ports, because Vast.ai does not provide stable URLs. The internal ports (11434 for Ollama, 8188 for ComfyUI) are mapped to random external ports that can change on each pod restart.
```mermaid
sequenceDiagram
autonumber
participant Service as Requesting Service
participant Provider as Provider Port
participant VastAiAdapter as Vast.ai Adapter
participant VastAiAPI as Vast.ai REST API
Service->>Provider: getConnectionDetails()
Provider->>VastAiAdapter: getConnectionDetails()
VastAiAdapter->>VastAiAPI: GET /api/v0/instances/{instanceId} {Authorization: Bearer api_key}
alt Instance found and running
VastAiAPI-->>VastAiAdapter: {public_ip_addr, ports: [{internal: 11434, external: 54321}, {internal: 8188, external: 54322}]}
VastAiAdapter->>VastAiAdapter: resolve external port for internal 11434
VastAiAdapter->>VastAiAdapter: resolve external port for internal 8188
VastAiAdapter->>VastAiAdapter: construct ollamaUrl as http://{public_ip}:54321
VastAiAdapter->>VastAiAdapter: construct comfyUrl as http://{public_ip}:54322
VastAiAdapter-->>Provider: {ollamaUrl, comfyUrl}
Provider-->>Service: {ollamaUrl, comfyUrl}
else Instance not found or not running
VastAiAPI-->>VastAiAdapter: 404 or instance status not running
VastAiAdapter-->>Provider: {error: instance not reachable}
Provider-->>Service: {error: instance not reachable}
end
```
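The matching Vast.ai adapter, sketched against the simplified response shape in the diagram; the real Vast.ai payload differs in detail, so treat the field names and the console.vast.ai host as assumptions. It reuses the ConnectionDetails interface from the RunPod sketch.

```typescript
interface VastInstance {
  public_ip_addr: string;
  ports: { internal: number; external: number }[];
}

class VastAiAdapter {
  constructor(
    private readonly instanceId: string,
    private readonly apiKey: string,
  ) {}

  async getConnectionDetails(): Promise<ConnectionDetails> {
    const res = await fetch(
      `https://console.vast.ai/api/v0/instances/${this.instanceId}`,
      { headers: { Authorization: `Bearer ${this.apiKey}` } },
    );
    if (!res.ok) throw new Error("instance not reachable");
    const instance = (await res.json()) as VastInstance;

    // External ports are random and can change on restart, so they are
    // resolved on every lookup rather than cached.
    const externalFor = (internal: number): number => {
      const mapping = instance.ports.find((p) => p.internal === internal);
      if (!mapping) throw new Error(`no external mapping for port ${internal}`);
      return mapping.external;
    };

    return {
      ollamaUrl: `http://${instance.public_ip_addr}:${externalFor(11434)}`,
      comfyUrl: `http://${instance.public_ip_addr}:${externalFor(8188)}`,
    };
  }
}
```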
The GPU Lifecycle Manager calls start on the Provider Port when a pod needs to be started. The RunPod adapter calls the RunPod REST API to start the pod by ID.
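One possible shape for the Provider Port that both adapters implement; the method names follow the scenarios above, and everything else is an assumption.

```typescript
interface GpuProviderPort {
  // RunPod: derived locally from the pod ID; Vast.ai: resolved via the API.
  getConnectionDetails(): Promise<ConnectionDetails>;
  // The RunPod adapter implements this by calling the RunPod REST API
  // to start the pod by its ID.
  start(): Promise<void>;
  stop(): Promise<void>;
  getStatus(): Promise<"STOPPED" | "STARTING" | "READY">;
}
```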