<aside>
Project Highlights
Overview
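This system simulates a healthcare clinic receptionist that can check available appointment slots, book appointments, verify insurance, and answer questions about the clinic.
The assistant uses OpenAI Whisper for speech-to-text, GPT-4o with function calling for dialogue, and ElevenLabs for text-to-speech.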
Project Structure
healthcare-voice-assistant/
├── src/
│   ├── conversation_engine.py    # LLM orchestration & function calling
│   ├── voice_handler.py          # STT/TTS integration
│   ├── appointment_service.py    # Appointment scheduling logic
│   └── insurance_service.py      # Insurance verification logic
├── data/
│   ├── appointments.json         # Mock appointment calendar
│   ├── insurance_providers.json  # Accepted insurance providers
│   └── clinic_info.json          # Clinic information
├── demos/                        # Complete demo recordings (3 MP3s)
├── recordings/                   # Individual audio clips from testing
├── main.py                       # Application entry point
├── requirements.txt              # Python dependencies
├── .env.example                  # Environment variables template
├── README.md                     # This file
└── SYSTEM_DESIGN.md              # Architecture documentation
Setup Instructions
1. Prerequisites
2. Installation
# Clone or extract the project
cd healthcare-voice-assistant
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
3. Configure API Keys
# Copy the example env file
cp .env.example .env
# Edit .env and add your API keys
OPENAI_API_KEY=your_openai_api_key_here
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM # Optional: Rachel voice (default)
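One way the application might read these settings at startup is sketched below. It uses only the standard library and assumes the variables are already present in the process environment (for example, exported by the shell or loaded by a dotenv helper). `Config` and `load_config` are hypothetical names for illustration, not taken from the project's source.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default voice ID copied from the .env example above (Rachel).
DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"


@dataclass(frozen=True)
class Config:
    openai_api_key: str
    elevenlabs_api_key: str
    elevenlabs_voice_id: str


def load_config(env: Optional[Mapping[str, str]] = None) -> Config:
    """Build a Config from environment variables, failing early on missing keys."""
    env = os.environ if env is None else env
    required = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Config(
        openai_api_key=env["OPENAI_API_KEY"],
        elevenlabs_api_key=env["ELEVENLABS_API_KEY"],
        # ELEVENLABS_VOICE_ID is optional; fall back to the documented default.
        elevenlabs_voice_id=env.get("ELEVENLABS_VOICE_ID") or DEFAULT_VOICE_ID,
    )
```

Failing at startup with the names of the missing variables tends to be easier to debug than an authentication error raised later in the middle of a call.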
Getting API Keys:
Usage
Run the Application
python main.py
Available Modes
1. Text Mode - Interactive chat (no voice)
2. Voice Simulation - Appointment Scheduling
3. Voice Simulation - Insurance Verification
4. Voice Simulation - No Available Slot
5. Process Audio File
Generated Recordings During Testing
When running voice simulations, individual audio clips are saved in the recordings/ folder:
appointment_01_greeting.mp3
appointment_02_response.mp3
These files show the step-by-step conversation flow.
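The file names above appear to follow a `<scenario>_<step>_<label>.mp3` pattern with a zero-padded step number. A minimal sketch of a helper that produces such names is shown here; `clip_path` is a hypothetical function, and the pattern is inferred from the two example names rather than from the project's code.

```python
from pathlib import Path


def clip_path(scenario: str, step: int, label: str, root: str = "recordings") -> Path:
    """Return a path such as recordings/appointment_01_greeting.mp3."""
    if step < 1:
        raise ValueError("step numbering starts at 1")
    # Zero-pad the step so clips sort in conversation order.
    return Path(root) / f"{scenario}_{step:02d}_{label}.mp3"
```

Zero-padding keeps the clips in conversation order when the folder is listed alphabetically.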
How It Works
Architecture
┌─────────────┐
│    User     │
│    Audio    │
└──────┬──────┘
       │
       ▼
┌────────────────┐
│ Speech-to-Text │  (OpenAI Whisper)
│     (STT)      │
└──────┬─────────┘
       │
       ▼
┌─────────────────────────────┐
│  Conversation Engine        │
│  - GPT-4o for dialogue      │
│  - Function calling for:    │
│    * check_available_slots  │
│    * book_appointment       │
│    * verify_insurance       │
│    * get_clinic_info        │
└──────┬──────────────────────┘
       │
       ▼
┌────────────────┐
│ Text-to-Speech │  (ElevenLabs)
│     (TTS)      │
└──────┬─────────┘
       │
       ▼
┌─────────────┐
│    Audio    │
│   Response  │
└─────────────┘
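The flow in the diagram can be sketched as three stages chained per user turn. The example below is an illustration only: `VoicePipeline` and `handle_turn` are hypothetical names, and the three stages are stand-in callables so the sketch runs without API keys. In the real system they would wrap Whisper, GPT-4o, and ElevenLabs respectively.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    transcribe: Callable[[bytes], str]   # STT stage (Whisper in the diagram)
    respond: Callable[[str], str]        # conversation engine stage (GPT-4o)
    synthesize: Callable[[str], bytes]   # TTS stage (ElevenLabs)

    def handle_turn(self, audio_in: bytes) -> bytes:
        """Run one user turn through STT -> engine -> TTS."""
        user_text = self.transcribe(audio_in)
        reply_text = self.respond(user_text)
        return self.synthesize(reply_text)


# Stand-in stages so the sketch runs without network access or API keys.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"You said: {text}",
    synthesize=lambda text: text.encode("utf-8"),
)
```

Passing the stages in as callables keeps each one replaceable, which is also what makes a text-only mode possible: the STT and TTS stages can be swapped for pass-through functions.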
Key Components
1. Conversation Engine (conversation_engine.py)
2. Voice Handler (voice_handler.py)
3. Appointment Service (appointment_service.py)
4. Insurance Service (insurance_service.py)
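The components above suggest a flow in which the Conversation Engine receives a function call requested by the model and routes it to the appointment or insurance service. A minimal sketch of that routing follows. The function names come from the architecture diagram; the argument names, return shapes, in-memory data, and `dispatch_tool_call` are assumptions made for illustration, and `get_clinic_info` is left out for brevity.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-ins for data/appointments.json and
# data/insurance_providers.json; the real file schemas are not shown here.
OPEN_SLOTS: Dict[str, List[str]] = {"2025-01-15": ["09:00", "10:30"]}
ACCEPTED_PROVIDERS = {"example health", "sample mutual"}


def check_available_slots(date: str) -> Dict[str, Any]:
    return {"date": date, "slots": list(OPEN_SLOTS.get(date, []))}


def book_appointment(date: str, time: str, patient_name: str) -> Dict[str, Any]:
    slots = OPEN_SLOTS.get(date, [])
    if time not in slots:
        return {"booked": False, "reason": "slot unavailable"}
    slots.remove(time)  # a booked slot is no longer offered
    return {"booked": True, "date": date, "time": time, "patient_name": patient_name}


def verify_insurance(provider: str) -> Dict[str, Any]:
    accepted = provider.strip().lower() in ACCEPTED_PROVIDERS
    return {"provider": provider, "accepted": accepted}


# Maps the tool name requested by the model to a local handler.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "check_available_slots": check_available_slots,
    "book_appointment": book_appointment,
    "verify_insurance": verify_insurance,
}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Execute a model-requested tool call and return a JSON string result."""
    handler = TOOLS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        result = handler(**json.loads(arguments_json))
    except (TypeError, json.JSONDecodeError) as exc:
        # Bad or missing arguments are reported back instead of raising.
        return json.dumps({"error": str(exc)})
    return json.dumps(result)
```

Returning errors as JSON rather than raising lets the engine hand the failure back to the model, which can then ask the caller for the missing detail.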
Why GPT-4o over GPT-3.5?
Why ElevenLabs over alternatives?
Architecture Choices:

View GitHub Repository: