<aside>
Build a reusable AI voice assistant that receives voice messages on Telegram, transcribes them, generates a response with an LLM, and replies with natural-sounding audio from ElevenLabs.
Most Telegram bots are text-only and lack voice interaction. There is no easy plug-and-play AI voice assistant that handles transcription, context-aware reasoning, and human-like spoken replies.
The solution is a modular, reusable workflow built with n8n, OpenRouter, and ElevenLabs that runs through the following steps:
🔹 Step 1 – Trigger: Telegram Bot Trigger (listens for new voice/audio messages).
🔹 Step 2 – Download the file from Telegram using "Get file".
🔹 Step 3 – Transcribe the audio to text using a speech-to-text node.
🔹 Step 4 – Send the transcription to an AI Agent connected to an OpenRouter chat model (e.g., GPT-4 or Claude), which handles the natural-language reasoning and generates a text response.
🔹 Step 5 – Convert the text response to speech using ElevenLabs.
🔹 Step 6 – Send the voice message back to the user.
🔹 Step 7 – Trigger: Webhook (POST) receives user input (e.g., from an ElevenLabs Voice Agent).
🔹 Step 8 – Message is passed to a Perplexity
🔹 Step 9 – The response is routed to an AI Agent that can use memory, context, or tools if needed.
🔹 Step 10 – The final AI-generated response is sent back via Respond to Webhook.
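
The Telegram voice path (Steps 1 to 6) is configured visually in n8n, but its data flow can be sketched in plain Python to make the order of the stages explicit. This is only an illustration under stated assumptions: `VoicePipeline`, its field names, and the stub lambdas are hypothetical and do not correspond to any n8n, Telegram, OpenRouter, or ElevenLabs API. Each field stands in for one node of the workflow.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical stand-in for Steps 2-6; each field mirrors one workflow node."""
    download: Callable[[str], bytes]          # Step 2: Telegram "Get file"
    transcribe: Callable[[bytes], str]        # Step 3: speech-to-text
    reason: Callable[[str], str]              # Step 4: AI Agent via OpenRouter
    synthesize: Callable[[str], bytes]        # Step 5: ElevenLabs text-to-speech
    send_voice: Callable[[int, bytes], None]  # Step 6: reply to the user

    def handle(self, chat_id: int, file_id: str) -> str:
        """Step 1 supplies a chat id and file id; run the remaining steps in order."""
        audio = self.download(file_id)
        text = self.transcribe(audio)
        reply = self.reason(text)
        self.send_voice(chat_id, self.synthesize(reply))
        return reply


# Stub stages (assumptions) so the sketch runs without network access or API keys.
sent = []
pipeline = VoicePipeline(
    download=lambda file_id: f"audio:{file_id}".encode(),
    transcribe=lambda audio: "what is n8n?",
    reason=lambda text: f"You asked: {text}",
    synthesize=lambda reply: reply.encode(),
    send_voice=lambda chat_id, voice: sent.append((chat_id, voice)),
)
print(pipeline.handle(chat_id=42, file_id="abc"))
```

Keeping every stage behind a plain callable mirrors the modularity the workflow aims for: swapping the chat model or the speech provider means replacing one stage, not rewriting the flow.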
</aside>
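
The webhook path (Steps 7 to 10) follows a request/response shape: a POST body comes in, passes through a search stage and an AI Agent, and the result is returned to the caller. The sketch below shows that shape as a single pure function, under assumptions: the `message` and `response` field names, the `handle_webhook` function, and the stub callables are all hypothetical, and the agent's memory and tools from Step 9 are not modeled.

```python
import json
from typing import Callable


def handle_webhook(
    raw_body: bytes,
    search: Callable[[str], str],      # Step 8: stand-in for the Perplexity stage
    agent: Callable[[str, str], str],  # Step 9: stand-in for the AI Agent
) -> tuple:
    """Return (HTTP status, JSON body) for one POST request (Steps 7 and 10)."""
    try:
        payload = json.loads(raw_body)
        message = payload["message"]  # assumed field name, not from the workflow
    except (ValueError, KeyError, TypeError):
        # Covers malformed JSON, invalid UTF-8, and bodies without a "message" key.
        return 400, json.dumps({"error": "expected JSON with a 'message' field"})
    findings = search(message)
    answer = agent(message, findings)
    return 200, json.dumps({"response": answer})


# Stub stages (assumptions) so the sketch runs offline.
status, body = handle_webhook(
    b'{"message": "weather in Oslo"}',
    search=lambda q: f"results for {q}",
    agent=lambda q, ctx: f"Based on {ctx}: answer to {q}",
)
print(status, body)
```

Returning the status code alongside the body keeps the error case explicit, which matters when the caller is a voice agent waiting on the response.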

Download & Explore the Workflow: