Overview
The Voice Agent Runtime is a Python microservice that handles all real-time voice processing. It’s separate from TalkifAI Studio (the web app) and runs on a dedicated Google Cloud VM for performance.
Repository: Livekit-Production-Agent
Stack: Python, FastAPI, LiveKit Agents SDK
Key Responsibilities
| Function | Implementation |
|---|---|
| LiveKit agent workers | entrypoint.py — Agent entry point |
| STT/LLM/TTS orchestration | providers/ — Provider integrations |
| Session lifecycle | session/ — Session creation and management |
| Function calling | tools/ — Custom and built-in tools |
| Recordings | services/recording_service.py → GCS |
| Transcripts | services/transcription_service.py |
| Noise cancellation | rnnoise_wrapper.py |
| Post-call analysis | services/analysis_service.py |
| Billing | services/billing_service.py |
| REST API | main.py — FastAPI on port 8000 |
Architecture
Session Lifecycle
Provider Integrations
STT Providers (providers/stt.py)
| Provider | Model | Characteristics |
|---|---|---|
| Deepgram | Nova 2 | Best real-time accuracy |
| OpenAI | Whisper | High accuracy, slightly slower |
| AssemblyAI | Universal 2 | Strong with accents |
LLM Providers (providers/llm.py)
| Provider | Models |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo |
| Google | gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash |
TTS Providers (providers/voice.py)
| Provider | Characteristics |
|---|---|
| Cartesia Sonic | Low latency, natural |
| OpenAI TTS | High quality, multiple voices |
| ElevenLabs | Most natural, emotion-aware |
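
The provider tables above imply that each agent picks one STT provider, one LLM model, and one TTS provider. As a rough illustration only, the sketch below validates such a selection against the listed options. The names `PipelineConfig`, `STT_PROVIDERS`, `LLM_MODELS`, and `TTS_PROVIDERS`, the lowercase keys, and the `"google"` key for the Gemini models are all assumptions made for this example; the actual interfaces in providers/ may look quite different.

```python
from dataclasses import dataclass

# Provider and model names are copied from the tables above.
# The dictionary layout and lowercase keys are hypothetical.
STT_PROVIDERS = {
    "deepgram": "Nova 2",
    "openai": "Whisper",
    "assemblyai": "Universal 2",
}
LLM_MODELS = {
    "openai": {"gpt-4o", "gpt-4o-mini", "gpt-4-turbo"},
    "google": {"gemini-1.5-pro", "gemini-1.5-flash", "gemini-2.0-flash"},
}
TTS_PROVIDERS = {"cartesia", "openai", "elevenlabs"}


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical per-agent choice of STT, LLM, and TTS providers."""

    stt: str
    llm_provider: str
    llm_model: str
    tts: str

    def __post_init__(self) -> None:
        # Reject any combination that is not listed in the tables.
        if self.stt not in STT_PROVIDERS:
            raise ValueError(f"unknown STT provider: {self.stt}")
        if self.llm_model not in LLM_MODELS.get(self.llm_provider, set()):
            raise ValueError(
                f"model {self.llm_model!r} is not listed for {self.llm_provider!r}"
            )
        if self.tts not in TTS_PROVIDERS:
            raise ValueError(f"unknown TTS provider: {self.tts}")
```

With this sketch, `PipelineConfig(stt="deepgram", llm_provider="openai", llm_model="gpt-4o-mini", tts="cartesia")` constructs normally, while pairing `llm_provider="openai"` with a Gemini model raises `ValueError` before any session starts.
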
Batch Calling System
The runtime includes a Redis-backed batch calling system (batch_system/):
- ARQ for async job processing
- Two-level concurrency: Org limit + Batch limit
- Auto-retry: 3 attempts per call
- Scheduling: IANA timezone support
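
The two-level concurrency limit and the three-attempt retry rule listed above can be sketched with plain `asyncio` semaphores. This is a minimal illustration under stated assumptions, not the code in batch_system/: the real system runs on ARQ and Redis, and the names `place_call_with_retry`, `run_batch`, `CallResult`, and the `dial` callback are hypothetical. Treating `ConnectionError` as the retryable failure is also an assumption.

```python
import asyncio
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # "3 attempts per call" from the list above


@dataclass
class CallResult:
    number: str
    ok: bool
    attempts: int


async def place_call_with_retry(number, dial, org_limit, batch_limit):
    """Attempt one call up to MAX_ATTEMPTS times.

    Each attempt holds an org-level slot and a batch-level slot at the
    same time, so neither limit can be exceeded.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        # Always acquire in the same order (org, then batch) to avoid deadlock.
        async with org_limit, batch_limit:
            try:
                await dial(number)
                return CallResult(number, True, attempt)
            except ConnectionError:
                # Both slots are released on leaving the block, then we retry.
                pass
    return CallResult(number, False, MAX_ATTEMPTS)


async def run_batch(numbers, dial, org_limit, batch_size):
    """Run one batch; org_limit is shared with other batches of the same org."""
    batch_limit = asyncio.Semaphore(batch_size)
    tasks = [
        place_call_with_retry(n, dial, org_limit, batch_limit) for n in numbers
    ]
    # gather preserves input order, so results line up with numbers.
    return await asyncio.gather(*tasks)
```

To try it, create `asyncio.Semaphore(org_max)` inside a coroutine, pass it to `run_batch` together with an async `dial(number)` function, and inspect the returned `CallResult` list. The effective parallelism is the smaller of the org limit and the batch limit.
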
Environment Variables
Deployment
The runtime is deployed on a Google Cloud VM (not serverless) because:
- LiveKit agent workers need persistent WebSocket connections
- Low-latency audio processing benefits from dedicated compute
- Batch calling workers need long-running processes
See the Dockerfile and .github/workflows/ for CI/CD details.