Valid JWT Flow

A client sends a request with a valid Bearer token. The Auth Filter validates the signature and checks that the token has not expired. The user identity is then extracted from the claims and attached to the request context before the request proceeds to the target service.

sequenceDiagram
    autonumber
    participant Client
    participant AuthFilter as Auth Filter
    participant TargetService as Target Service

    Client->>AuthFilter: POST /api/v1/chat {Authorization: Bearer token, body}
    AuthFilter->>AuthFilter: extract token from Authorization header
    AuthFilter->>AuthFilter: validate JWT signature against secret
    AuthFilter->>AuthFilter: validate token expiry
    AuthFilter->>AuthFilter: extract user identity from claims
    AuthFilter-->>TargetService: request forwarded, user context attached
    TargetService-->>Client: 200 response
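
As a concrete sketch of this flow, the filter could be written as Express-style middleware using the jsonwebtoken package. The authFilter name, the JWT_SECRET environment variable, and the sub claim are assumptions for illustration, not part of the design above.

    import jwt from "jsonwebtoken";
    import type { Request, Response, NextFunction } from "express";

    export function authFilter(req: Request, res: Response, next: NextFunction) {
      // Extract the token from the Authorization header.
      const header = req.headers.authorization ?? "";
      if (!header.startsWith("Bearer ")) {
        return res.status(401).json({
          status: "error",
          error: { code: "INVALID_TOKEN", message: "token is missing or malformed" },
        });
      }
      try {
        // verify() checks the signature and the exp claim in a single call,
        // throwing if either fails.
        const claims = jwt.verify(
          header.slice("Bearer ".length),
          process.env.JWT_SECRET!,
        ) as { sub: string };
        // Attach the user identity to the request context before forwarding.
        (req as any).user = { id: claims.sub };
        next();
      } catch (err) {
        // Failure branches are detailed in the invalid JWT flow below.
        res.status(401).json({
          status: "error",
          error: { code: "INVALID_TOKEN", message: "token signature is invalid" },
        });
      }
    }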

Invalid JWT Flow

A client sends a request with a missing, malformed, or expired token. The Auth Filter rejects the request immediately and no downstream service is invoked.

sequenceDiagram
    autonumber
    participant Client
    participant AuthFilter as Auth Filter

    Client->>AuthFilter: POST /api/v1/chat {Authorization: Bearer token, body}
    AuthFilter->>AuthFilter: extract token from Authorization header

    alt Token missing or malformed
        AuthFilter-->>Client: 401 {status: error, error: {code: INVALID_TOKEN, message: token is missing or malformed}}
    else Token signature invalid
        AuthFilter-->>Client: 401 {status: error, error: {code: INVALID_TOKEN, message: token signature is invalid}}
    else Token expired
        AuthFilter-->>Client: 401 {status: error, error: {code: TOKEN_EXPIRED, message: token has expired}}
    end
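
Under the same jsonwebtoken assumption, the three 401 branches map directly onto the library's error classes; the toAuthError helper below is hypothetical. TokenExpiredError extends JsonWebTokenError, so the expiry check must come first.

    import jwt from "jsonwebtoken";

    type AuthError = {
      status: "error";
      error: { code: "INVALID_TOKEN" | "TOKEN_EXPIRED"; message: string };
    };

    // Translates a jwt.verify() failure into the 401 body shown in the diagram.
    export function toAuthError(err: unknown): AuthError {
      if (err instanceof jwt.TokenExpiredError) {
        return { status: "error", error: { code: "TOKEN_EXPIRED", message: "token has expired" } };
      }
      if (err instanceof jwt.JsonWebTokenError) {
        // Raised for bad signatures (and malformed tokens that reach verify()).
        return { status: "error", error: { code: "INVALID_TOKEN", message: "token signature is invalid" } };
      }
      return { status: "error", error: { code: "INVALID_TOKEN", message: "token is missing or malformed" } };
    }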

Telegram Natural Language Query

A user sends a plain English message to the Telegram bot asking about pod or system status. The Telegram Bot Service receives the webhook, sends the message text to the local Ollama instance on the home server for intent classification, and routes the resolved intent to the appropriate internal service. The result is formatted as a natural language response and sent back to the user via the Telegram Bot API.

sequenceDiagram
    autonumber
    participant Telegram as Telegram Platform
    participant BotService as Telegram Bot Service
    participant LocalOllama as Local Ollama (Home Server)
    participant GpuManager as GPU Lifecycle Manager
    participant CostTracker as Cost Tracker
    participant TelegramAPI as Telegram Bot API

    Telegram->>BotService: POST /api/v1/bot/telegram {secret_token, message: {chat_id, text: is the pod running?}}
    BotService->>BotService: validate Telegram secret token header

    BotService->>LocalOllama: POST /api/generate {model: phi3:mini, prompt: classify intent from message text}
    LocalOllama-->>BotService: {intent: POD_STATUS, params: {}}

    alt Intent is POD_STATUS
        BotService->>GpuManager: getStatus()
        GpuManager-->>BotService: {status: READY, uptime: 23 minutes}
        BotService->>CostTracker: getSessionEstimate()
        CostTracker-->>BotService: {estimatedCost: $0.13, sessionDuration: 23 minutes}
        BotService->>BotService: format natural language response
        BotService->>TelegramAPI: POST /sendMessage {chat_id, text: Pod has been running for 23 minutes, estimated cost so far is $0.13}
        TelegramAPI-->>BotService: message delivered
    else Intent is POD_STOP
        BotService->>GpuManager: requestShutdown()
        GpuManager-->>BotService: shutdown initiated
        BotService->>TelegramAPI: POST /sendMessage {chat_id, text: Shutting down the pod now}
        TelegramAPI-->>BotService: message delivered
    else Intent is POD_START
        BotService->>GpuManager: requestStart()
        GpuManager-->>BotService: pod starting
        BotService->>TelegramAPI: POST /sendMessage {chat_id, text: Starting the pod, will take about a minute}
        TelegramAPI-->>BotService: message delivered
    else Intent is COST_SUMMARY
        BotService->>CostTracker: getMonthlySummary()
        CostTracker-->>BotService: {totalHours: 18, estimatedTotal: $4.90, sessionCount: 7}
        BotService->>TelegramAPI: POST /sendMessage {chat_id, text: This month you have run 7 sessions totalling 18 hours, estimated cost is $4.90}
        TelegramAPI-->>BotService: message delivered
    else Intent unrecognised
        BotService->>TelegramAPI: POST /sendMessage {chat_id, text: I did not understand that. You can ask about pod status, cost, or ask me to start or stop the pod}
        TelegramAPI-->>BotService: message delivered
    end

    BotService-->>Telegram: 200 OK
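
The classification step might look like the following, assuming Node's global fetch and Ollama's /api/generate endpoint with streaming disabled. The prompt wording, the Intent type, and the OLLAMA_URL variable are illustrative; the {intent, params} shape mirrors the diagram.

    type Intent = { intent: string; params: Record<string, unknown> };

    const OLLAMA_URL = process.env.OLLAMA_URL ?? "http://localhost:11434";

    export async function classifyIntent(text: string): Promise<Intent> {
      const res = await fetch(`${OLLAMA_URL}/api/generate`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          model: "phi3:mini",
          prompt: `Classify the intent of this message. Respond with JSON containing "intent" and "params": ${text}`,
          stream: false, // return a single JSON object rather than a token stream
          format: "json", // constrain the model to emit valid JSON
        }),
      });
      const data = (await res.json()) as { response: string };
      // The model's answer is itself a JSON string inside the "response" field.
      return JSON.parse(data.response) as Intent;
    }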

Telegram Long Running Command

A user sends a message triggering a fantasy mode generation through the Telegram bot. Because image and story generation can take 30 seconds or more, the bot sends an immediate acknowledgement to avoid a Telegram webhook timeout, then processes the request asynchronously and sends a follow-up message with the result when complete.

sequenceDiagram
    autonumber
    participant Telegram as Telegram Platform
    participant BotService as Telegram Bot Service
    participant LocalOllama as Local Ollama (Home Server)
    participant TelegramAPI as Telegram Bot API
    participant Orchestrator as Fantasy Orchestrator
    participant GpuManager as GPU Lifecycle Manager

    Telegram->>BotService: POST /api/v1/bot/telegram {secret_token, message: {chat_id, text: generate a fantasy story from this image, image attached}}
    BotService->>BotService: validate Telegram secret token

    BotService->>LocalOllama: POST /api/generate {model: phi3:mini, prompt: classify intent}
    LocalOllama-->>BotService: {intent: FANTASY_GENERATE, params: {has_image: true}}

    BotService->>TelegramAPI: POST /sendMessage {chat_id, text: On it! Generating your fantasy story and illustration, this may take up to a minute}
    TelegramAPI-->>BotService: acknowledgement sent
    BotService-->>Telegram: 200 OK

    Note over BotService, Orchestrator: Processing continues asynchronously after webhook response

    BotService->>GpuManager: getStatus()

    alt Pod is STOPPED
        GpuManager-->>BotService: STOPPED
        BotService->>GpuManager: requestStart()
        Note over BotService: waits for pod to reach READY before continuing
        GpuManager-->>BotService: READY
    else Pod is READY
        GpuManager-->>BotService: READY
    end

    BotService->>Orchestrator: generateFantasy(image_bytes, prompt)
    Orchestrator-->>BotService: {story: text, image_base64: encoded image}

    BotService->>TelegramAPI: POST /sendPhoto {chat_id, photo: image, caption: story text}
    TelegramAPI-->>BotService: result delivered to user
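
A sketch of the acknowledge-then-process pattern under the same Express assumption. The helpers sendMessage, sendPhoto, getStatus, requestStart, and generateFantasy are hypothetical stand-ins for the participants in the diagram, and the five-second poll interval is an arbitrary choice.

    import type { Request, Response } from "express";

    // Hypothetical helpers standing in for the diagram's participants.
    declare function sendMessage(chatId: number, text: string): Promise<void>;
    declare function sendPhoto(chatId: number, photoBase64: string, caption: string): Promise<void>;
    declare function getStatus(): Promise<"STOPPED" | "STARTING" | "READY">;
    declare function requestStart(): Promise<void>;
    declare function generateFantasy(message: unknown): Promise<{ story: string; image_base64: string }>;

    async function ensurePodReady(): Promise<void> {
      if ((await getStatus()) === "STOPPED") {
        await requestStart();
      }
      // Poll until the GPU Lifecycle Manager reports READY.
      while ((await getStatus()) !== "READY") {
        await new Promise((resolve) => setTimeout(resolve, 5_000));
      }
    }

    export async function handleFantasyGenerate(req: Request, res: Response) {
      const { chat_id } = req.body.message;

      // Acknowledge immediately so Telegram does not time out and retry the webhook.
      await sendMessage(chat_id, "On it! Generating your fantasy story and illustration, this may take up to a minute");
      res.sendStatus(200);

      // Deliberately not awaited: generation continues after the webhook response.
      ensurePodReady()
        .then(() => generateFantasy(req.body.message))
        .then(({ story, image_base64 }) => sendPhoto(chat_id, image_base64, story))
        .catch((err) => console.error("fantasy generation failed", err));
    }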

RunPod Connection Detail Resolution

A service requests the current connection details for reaching Ollama and ComfyUI on the GPU pod. When RunPod is the configured provider, the adapter constructs the connection URLs from the static pod ID without any API call. Because the pod ID never changes for the lifetime of the pod, the URLs are always predictable.

sequenceDiagram
    autonumber
    participant Service as Requesting Service
    participant Provider as Provider Port
    participant RunPodAdapter as RunPod Adapter

    Service->>Provider: getConnectionDetails()
    Provider->>RunPodAdapter: getConnectionDetails()
    RunPodAdapter->>RunPodAdapter: read pod ID from configuration
    RunPodAdapter->>RunPodAdapter: construct Ollama URL as https://{podId}-11434.proxy.runpod.net
    RunPodAdapter->>RunPodAdapter: construct ComfyUI URL as https://{podId}-8188.proxy.runpod.net
    RunPodAdapter-->>Provider: {ollamaUrl, comfyUrl}
    Provider-->>Service: {ollamaUrl, comfyUrl}
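
A minimal sketch of the adapter, assuming the pod ID is supplied through a RUNPOD_POD_ID environment variable (the variable name is illustrative). Note that this is pure string construction with no network call.

    export interface ConnectionDetails {
      ollamaUrl: string;
      comfyUrl: string;
    }

    // The proxy hostnames are derived from the pod ID alone.
    export function getConnectionDetails(): ConnectionDetails {
      const podId = process.env.RUNPOD_POD_ID!;
      return {
        ollamaUrl: `https://${podId}-11434.proxy.runpod.net`,
        comfyUrl: `https://${podId}-8188.proxy.runpod.net`,
      };
    }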

Vast.ai Dynamic IP Resolution

When Vast.ai is the configured provider, the adapter must query the Vast.ai API each time connection details are requested, to resolve the current public IP and mapped external ports, because Vast.ai does not provide stable URLs. The internal ports (11434 for Ollama, 8188 for ComfyUI) are mapped to random external ports that can change on each pod restart.

sequenceDiagram
    autonumber
    participant Service as Requesting Service
    participant Provider as Provider Port
    participant VastAiAdapter as Vast.ai Adapter
    participant VastAiAPI as Vast.ai REST API

    Service->>Provider: getConnectionDetails()
    Provider->>VastAiAdapter: getConnectionDetails()

    VastAiAdapter->>VastAiAPI: GET /api/v0/instances/{instanceId} {Authorization: Bearer api_key}

    alt Instance found and running
        VastAiAPI-->>VastAiAdapter: {public_ip_addr, ports: [{internal: 11434, external: 54321}, {internal: 8188, external: 54322}]}
        VastAiAdapter->>VastAiAdapter: resolve external port for internal 11434
        VastAiAdapter->>VastAiAdapter: resolve external port for internal 8188
        VastAiAdapter->>VastAiAdapter: construct ollamaUrl as http://{public_ip}:54321
        VastAiAdapter->>VastAiAdapter: construct comfyUrl as http://{public_ip}:54322
        VastAiAdapter-->>Provider: {ollamaUrl, comfyUrl}
        Provider-->>Service: {ollamaUrl, comfyUrl}
    else Instance not found or not running
        VastAiAPI-->>VastAiAdapter: 404 or instance status not running
        VastAiAdapter-->>Provider: {error: instance not reachable}
        Provider-->>Service: {error: instance not reachable}
    end
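
A sketch of the Vast.ai adapter, assuming the response shape shown in the diagram ({public_ip_addr, ports: [{internal, external}]}) and the console.vast.ai host; the real Vast.ai API field names may differ, so treat both as assumptions.

    interface VastInstance {
      public_ip_addr: string;
      ports: { internal: number; external: number }[];
    }

    export async function getConnectionDetails(instanceId: string, apiKey: string) {
      const res = await fetch(`https://console.vast.ai/api/v0/instances/${instanceId}`, {
        headers: { Authorization: `Bearer ${apiKey}` },
      });
      if (!res.ok) {
        // Covers both 404 and other not-running conditions surfaced as errors.
        throw new Error("instance not reachable");
      }
      const instance = (await res.json()) as VastInstance;

      // Resolve the external port mapped to each fixed internal port.
      const externalFor = (internal: number): number => {
        const mapping = instance.ports.find((p) => p.internal === internal);
        if (!mapping) throw new Error(`no mapping for internal port ${internal}`);
        return mapping.external;
      };

      return {
        ollamaUrl: `http://${instance.public_ip_addr}:${externalFor(11434)}`,
        comfyUrl: `http://${instance.public_ip_addr}:${externalFor(8188)}`,
      };
    }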

Provider Start - RunPod

The GPU Lifecycle Manager calls start on the Provider Port when a pod needs to be started. The RunPod adapter calls the RunPod REST API to start the pod by ID.