Overview
TalkifAI uses a credit-based billing system. Organizations purchase credits via Lemon Squeezy, and those credits are consumed as voice sessions run.
The billing system is a separate FastAPI microservice running on Google Cloud Run.
Key features:
Pay-as-you-go credit system (1 credit = $1 = ~1 minute of voice conversation)
BYOC (Bring Your Own Carrier) — no telephony markup
Grace period of $5 (calls allowed even with a negative balance, down to -$5)
Redis caching for fast quota checks (30s TTL)
Automatic cost calculation via background worker
Low balance email notifications
Architecture
┌───────────────────────────────────────────────────────┐
│ Frontend (Studio) │
│ Display balances, usage, payments │
└────────────────────────┬──────────────────────────────┘
│ REST API
▼
┌───────────────────────────────────────────────────────┐
│ Voice Agent Runtime (VM) │
│ ┌────────────────────────────────────────────────┐ │
│ │ Lemon Squeezy Service │ │
│ │ • Webhook: POST /lemon-squeezy/webhook │ │
│ │ • Verifies signature (HMAC-SHA256) │ │
│ │ • Processes orders │ │
│ │ • Adds credits on payment │ │
│ └────────────────────────────────────────────────┘ │
└────────────────────────┬──────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────┐
│ Billing Service (Cloud Run) │
│ │
│ API Endpoints: │
│ • POST /billing/sessions/start — Start session │
│ • POST /billing/sessions/end — End + calculate cost │
│ • POST /billing/credits/check — Check quota │
│ • POST /billing/credits/consume — Deduct credits │
│ • GET /billing/credits/balance/{org_id} — Balance │
│ • GET /billing/credits/history/{org_id} — History │
│ • GET /billing/dashboard — Usage analytics │
│ • GET /billing/pricing — Pricing rules │
│ │
│ Services: │
│ • SessionService — Track call lifecycle │
│ • CreditService — Credit balance management │
│ • PricingService — Cost calculation │
│ • NotificationService — Low balance alerts │
│ • OrganizationService — Org management │
│ │
│ Background Workers: │
│ • CostCalculatorWorker — Async cost processing │
│ │
│ Redis Cache (30s TTL): │
│ • Credit balance caching │
│ • Quota caching │
└───────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────┐
│ PostgreSQL Database (Neon Serverless) │
│ • BillingSession — Session tracking with costs │
│ • CreditTransaction — Credit movement history │
│ • PricingRule — Dynamic pricing configuration │
│ • LemonSqueezyOrder — One-time credit purchases │
│ • LemonSqueezyWebhookEvent — Webhook audit log │
└───────────────────────────────────────────────────────┘
Credit System
How Credits Work
1 credit = $1 USD = ~1 minute of voice conversation
Credits are:
Purchased via Lemon Squeezy (payment processor)
Stored per organization in the database
Consumed at session end (not during the call)
Cached in Redis (30-second TTL) for fast quota checks
Credit Flow
1. Pre-call: Check credits available (optional)
│
├── Insufficient → Reject call
└── OK → Allow call
│
▼
2. Call runs (credits NOT deducted yet)
│
▼
3. Call ends → POST /billing/sessions/end
│
▼
4. Session marked as "completed"
• endTime recorded
• duration calculated
│
▼
5. CostCalculatorWorker (background, polls every 10s)
• Finds completed sessions
• Calculates cost based on:
- Duration (seconds)
- Base cost (infrastructure)
- Model cost (AI processing, if platform keys)
- STT/TTS costs (if platform keys)
│
▼
6. POST /billing/credits/consume
• Optimistic locking prevents race conditions
• Credits deducted from org balance
• CreditTransaction record created
• Redis cache invalidated
│
▼
7. Notification check
• Balance < $1.00 → Email alert
• Balance < $0.00 → Grace period warning
• Balance < -$4.00 → Critical alert
Grace Period Logic
TalkifAI uses a grace period system to prevent sudden service interruption:
| Balance | Status | Calls Allowed |
|---|---|---|
| ≥ $0.00 | ✅ Normal | ✅ Yes |
| -$5.00 to $0.00 | ⚠️ Warning | ✅ Yes |
| < -$5.00 | ❌ Blocked | ❌ No |
Purpose: The $5 grace period prevents sudden service interruption due to timing issues or unexpected usage spikes.
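The table above can be read as a small decision function. The following is a minimal sketch, not the service's actual code: the function name, the status strings, and the choice to treat a balance of exactly -$5.00 as still inside the grace period are assumptions drawn from the table.

```python
from decimal import Decimal

# Assumed to match the $5 grace period described above.
GRACE_LIMIT = Decimal("5.00")


def balance_status(balance: Decimal) -> tuple[str, bool]:
    """Return (status, calls_allowed) for an organization balance in USD."""
    if balance >= 0:
        return "normal", True
    if balance >= -GRACE_LIMIT:
        # Negative, but still inside the grace period.
        return "warning", True
    return "blocked", False
```

For example, `balance_status(Decimal("-2.50"))` falls in the warning band and still allows calls, while any balance below -$5.00 is blocked.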
Pricing Model
BYOC Architecture (Bring Your Own Carrier)
Customers provide their own SIP credentials (Twilio, Telnyx, etc.) and pay carriers directly.
Platform only charges for:
Base platform fee (infrastructure)
AI usage (STT/TTS/LLM if using platform keys)
No telephony markup — carrier costs are NOT charged by TalkifAI.
Cost Breakdown
The total cost of a session is calculated as:
TOTAL = Base Cost + Model Cost + STT Cost + Voice Cost
Where:
• Base Cost — Infrastructure fee (always charged)
• Model Cost — LLM processing (only if keyMode = "talkifai_keys")
• STT Cost — Speech-to-text (only if keyMode = "talkifai_keys")
• Voice Cost — TTS voice (only if keyMode = "talkifai_keys")
Example Rates (via PricingRule table)
| Component | Rate | When Charged |
|---|---|---|
| Base (Webcall + platform keys) | ~$0.05/min | Always |
| Base (Webcall + own keys) | ~$0.02/min | Always |
| Base (Telephony + platform keys) | ~$0.10/min | Always |
| LLM (GPT-4o-mini) | ~$0.015/min | Platform keys only |
| LLM (Gemini Flash) | ~$0.010/min | Platform keys only |
| STT (Deepgram) | ~$0.003/min | Platform keys only |
| TTS (Cartesia) | ~$0.005/min | Platform keys only |
Example Calculation:
Session: Telephony call, 5 minutes, platform keys
• Base cost: $0.10/min × 5 min = $0.50
• LLM cost: $0.015/min × 5 min = $0.075
• STT cost: $0.003/min × 5 min = $0.015
• TTS cost: $0.005/min × 5 min = $0.025
• Carrier cost: $0.00 (customer pays Twilio directly)
• TOTAL: $0.615 (0.615 credits, since 1 credit = $1)
Actual rates are stored in the PricingRule database table and can be updated dynamically without code changes.
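The cost formula can be sketched in a few lines. This is an illustration under stated assumptions rather than the PricingService implementation: the rate dictionaries are hard-coded from the example rates (real rates come from the PricingRule table), the `gemini_flash` key is a hypothetical identifier, and billing is assumed to be pro-rated by the second.

```python
from decimal import Decimal

# Example per-minute rates from this section; a telephony + own-keys rate is
# not listed in the source, so that combination raises KeyError here.
BASE_RATES = {
    ("webcall", "talkifai_keys"): Decimal("0.05"),
    ("webcall", "my_keys"): Decimal("0.02"),
    ("telephony", "talkifai_keys"): Decimal("0.10"),
}
LLM_RATES = {"gpt_4o_mini": Decimal("0.015"), "gemini_flash": Decimal("0.010")}
STT_RATE = Decimal("0.003")  # Deepgram
TTS_RATE = Decimal("0.005")  # Cartesia


def session_cost(duration_seconds: int, session_type: str, key_mode: str, model: str) -> Decimal:
    """TOTAL = Base + Model + STT + Voice, pro-rated by duration (assumption)."""
    minutes = Decimal(duration_seconds) / Decimal(60)
    per_minute = BASE_RATES[(session_type, key_mode)]
    if key_mode == "talkifai_keys":
        # AI components are only charged when platform keys are used.
        per_minute += LLM_RATES[model] + STT_RATE + TTS_RATE
    return (per_minute * minutes).quantize(Decimal("0.001"))
```

Calling `session_cost(300, "telephony", "talkifai_keys", "gpt_4o_mini")` reproduces the $0.615 total from the example calculation.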
Payment Processing
TalkifAI uses Lemon Squeezy for payment processing:
Purchase Flow
1. User selects credit package in Studio
│
▼
2. Redirected to Lemon Squeezy checkout
│
▼
3. User completes payment
│
▼
4. Lemon Squeezy sends webhook to TalkifAI
POST /lemon-squeezy/webhook
│
▼
5. Webhook verified (HMAC-SHA256 signature)
│
▼
6. LemonSqueezyOrder record created
│
▼
7. Credits added to organization balance
│
▼
8. CreditTransaction record created (type: "topup")
│
▼
9. Redis cache invalidated
│
▼
10. User can immediately use new credits
Supported Events
| Event | Action |
|---|---|
| order_created | Add credits from one-time purchase |
Webhook payload includes:
custom_data.organization_id — Links payment to org
custom_data.user_id — User who made purchase
data.attributes.total — Amount in cents ($19.99 = 1999)
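Steps 5 to 7 of the purchase flow (verify the signature, then turn the payment amount into credits) might look roughly like the sketch below. The function names are illustrative, and nesting `custom_data` under a top-level `meta` object is an assumption about the payload layout; adjust the lookup if your payloads differ. The signature is assumed to arrive as a hex digest of the raw request body.

```python
import hashlib
import hmac
import json
from decimal import Decimal


def verify_signature(raw_body: bytes, signature_hex: str, secret: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


def credits_from_payload(raw_body: bytes) -> tuple[str, Decimal]:
    """Return (organization_id, credits) for an order_created payload."""
    payload = json.loads(raw_body)
    # Assumption: custom_data is nested under "meta".
    org_id = payload["meta"]["custom_data"]["organization_id"]
    total_cents = payload["data"]["attributes"]["total"]  # e.g. 1999 for $19.99
    # 1 credit = $1, so credits equal the dollar amount.
    return org_id, Decimal(total_cents) / 100
```

Verification must run on the raw bytes of the request, before any JSON parsing or re-serialization, because even a whitespace change alters the digest.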
Credit Packages
Credits are purchased as one-time payments (no subscriptions):
| Package | Price | Credits Added |
|---|---|---|
| Starter | $9.99 | $9.99 in credits |
| Pro | $39.99 | $39.99 in credits |
| Enterprise | $99.99 | $99.99 in credits |
Credits never expire and remain in the organization’s balance indefinitely.
Session Tracking
Start Session
POST /billing/sessions/start
Content-Type: application/json
{
  "organizationId": "org_abc123",
  "userId": "user_xyz789",
  "agentId": "agent_def456",
  "roomName": "room_session_123",
  "sessionType": "webcall",
  "keyMode": "talkifai_keys",
  "agentModel": "gpt_4o_mini",
  "agentArchitecture": "pipeline",
  "sttProvider": "deepgram",
  "voiceId": "cartesia-voice-123"
}
sessionType accepts "webcall" or "telephony"; keyMode accepts "talkifai_keys" or "my_keys".
Response:
{
  "success": true,
  "sessionId": "session_abc123",
  "message": "Session started successfully"
}
End Session
POST /billing/sessions/end
Content-Type: application/json
{
  "roomName": "room_session_123",
  "endTime": "2024-01-15T10:30:00Z"
}
Response:
{
  "success": true,
  "sessionId": "session_abc123",
  "duration": 180,
  "message": "Session ended successfully. Costs will be calculated shortly."
}
Costs are calculated asynchronously by the background worker (usually within 10-20 seconds).
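As a rough illustration of how a caller might prepare the two session requests, the sketch below builds them with the standard library and does not send anything. The base URL is a placeholder, the helper name is hypothetical, and any authentication headers these endpoints may require are not shown because the source does not specify them.

```python
import json
import urllib.request

BILLING_URL = "https://billing.example.com"  # hypothetical base URL


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request to the billing service."""
    return urllib.request.Request(
        BILLING_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


start_request = build_post("/billing/sessions/start", {
    "organizationId": "org_abc123",
    "userId": "user_xyz789",
    "agentId": "agent_def456",
    "roomName": "room_session_123",
    "sessionType": "webcall",
    "keyMode": "talkifai_keys",
})
end_request = build_post("/billing/sessions/end", {
    "roomName": "room_session_123",
    "endTime": "2024-01-15T10:30:00Z",
})
# urllib.request.urlopen(start_request) would actually send the request.
```

Both requests share `roomName`, which is the only identifier the end-session body carries to tie it back to the started session.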
Concurrency Safety
Optimistic Locking
To prevent race conditions when multiple calls end simultaneously:
WITHOUT Optimistic Locking (WRONG):
────────────────────────────────────────────────────────
Thread A reads: balance = $10.00
Thread B reads: balance = $10.00 ← Both read same value!
Thread A writes: $10.00 - $1.00 = $9.00
Thread B writes: $10.00 - $1.50 = $8.50 ← Overwrites Thread A!
Final balance: $8.50 ❌ WRONG! (Should be $7.50)
WITH Optimistic Locking (CORRECT):
────────────────────────────────────────────────────────
Thread A reads: balance = $10.00, timestamp = T1
Thread B reads: balance = $10.00, timestamp = T1
Thread A updates: WHERE timestamp = T1 → Success
New balance: $9.00, timestamp = T2
Thread B updates: WHERE timestamp = T1 → Fails! (timestamp changed)
Thread B retries: reads balance = $9.00, timestamp = T2
Thread B updates: WHERE timestamp = T2 → Success
New balance: $7.50, timestamp = T3
Final balance: $7.50 ✅ CORRECT!
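The retry pattern in the trace above can be reproduced with a conditional UPDATE. This sketch makes several substitutions that should be read as assumptions: it uses an in-memory SQLite database instead of PostgreSQL, an integer version counter instead of a timestamp, integer cents instead of dollars, and a hypothetical `org_balance` table.

```python
import sqlite3


def consume_credits(conn, org_id, cost_cents, max_retries=5):
    """Deduct cost_cents; retry if another writer changed the row first."""
    for _ in range(max_retries):
        balance, version = conn.execute(
            "SELECT balance_cents, version FROM org_balance WHERE org_id = ?",
            (org_id,),
        ).fetchone()
        # The UPDATE only matches if the version is still the one we read.
        cur = conn.execute(
            "UPDATE org_balance SET balance_cents = ?, version = version + 1 "
            "WHERE org_id = ? AND version = ?",
            (balance - cost_cents, org_id, version),
        )
        conn.commit()
        if cur.rowcount == 1:
            return balance - cost_cents
    raise RuntimeError("balance update kept conflicting; giving up")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE org_balance "
    "(org_id TEXT PRIMARY KEY, balance_cents INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO org_balance VALUES ('org_123', 1000, 1)")
consume_credits(conn, "org_123", 100)          # $10.00 -> $9.00
final = consume_credits(conn, "org_123", 150)  # $9.00 -> $7.50
```

A writer holding a stale version matches zero rows, so its write is rejected rather than silently overwriting the other deduction; it then re-reads and tries again.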
Redis Caching
Cache Flow
Request: Check credit for org_123
│
▼
┌──────────────────┐
│ Check Redis │ Key: "credit:balance:org_123"
│ TTL: 30 seconds │
└────┬─────────────┘
│
├─ HIT? → Return cached balance (fast path - 5ms)
│
└─ MISS? → Query database (slow path - 50ms)
└─ Store in cache (TTL: 30s)
└─ Return balance
Cache Invalidation
Cache is invalidated when:
Credits are consumed (session ends)
Credits are added (payment successful)
Manual balance refresh requested
Performance Impact:
Without cache: 100 requests = 100 DB queries
With cache: 100 requests = ~3-4 DB queries (96% reduction)
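The hit/miss flow and the invalidation rules follow the usual cache-aside pattern. The sketch below uses an in-memory dictionary as a stand-in for Redis so that it runs without a server; the class name, the injected clock, and the `db_queries` counter are illustrative assumptions, while the key format and 30-second TTL come from the flow above.

```python
import time


class BalanceCache:
    """In-memory stand-in for the Redis balance cache, with per-key expiry."""

    def __init__(self, load_from_db, ttl_seconds=30.0, clock=time.monotonic):
        self._load = load_from_db   # callable: org_id -> balance
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}          # key -> (expires_at, balance)
        self.db_queries = 0

    def get_balance(self, org_id):
        key = f"credit:balance:{org_id}"
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]         # cache hit: no database access
        # Cache miss or expired entry: query the database and store the result.
        self.db_queries += 1
        balance = self._load(org_id)
        self._entries[key] = (self._clock() + self._ttl, balance)
        return balance

    def invalidate(self, org_id):
        """Drop the entry, e.g. after credits are consumed or added."""
        self._entries.pop(f"credit:balance:{org_id}", None)
```

With a real Redis client the same shape applies: read the key, fall back to the database on a miss, write the value back with a 30-second expiry, and delete the key whenever the balance changes.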
Notifications
The billing service sends email notifications for:
| Trigger | Threshold | Email Content |
|---|---|---|
| Low Balance | Balance < $1.00 | "Your balance is low. Please add credits." |
| Grace Period | Balance < $0.00 | "You're using grace period credits. Add credits now." |
| Critical | Balance < -$4.00 | "Service suspension imminent. Add credits immediately." |
Notifications are sent to the organization owner’s email address.
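The three thresholds overlap (a balance of -$4.50 is below all of them), so some rule has to pick one email. A minimal sketch, assuming only the most severe matching trigger is used; the function name and level strings are hypothetical.

```python
from decimal import Decimal
from typing import Optional


def notification_level(balance: Decimal) -> Optional[str]:
    """Return the most severe notification that applies, or None."""
    if balance < Decimal("-4.00"):
        return "critical"
    if balance < Decimal("0.00"):
        return "grace_period"
    if balance < Decimal("1.00"):
        return "low_balance"
    return None
```

Checking from most to least severe keeps the function to one comparison per level and avoids sending several emails for a single balance change.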
API Reference
Check Credit Quota
POST /billing/credits/check
Content-Type: application/json
{
  "organizationId": "org_abc123"
}
Response:
{
  "success": true,
  "data": {
    "allowed": true,
    "balance": 25.50,
    "gracePeriodUsed": 0.00,
    "gracePeriodLimit": 5.00
  }
}
Consume Credits
POST /billing/credits/consume
Content-Type: application/json
{
  "organizationId": "org_abc123",
  "sessionCost": 0.615,
  "roomName": "room_session_123"
}
Response:
{
  "success": true,
  "data": {
    "balanceBefore": 25.50,
    "balanceAfter": 24.885,
    "amountConsumed": 0.615
  }
}
Get Balance
GET /billing/credits/balance/{org_id}
Authorization: Bearer YOUR_API_KEY
Response:
{
  "success": true,
  "data": {
    "organizationId": "org_abc123",
    "balance": 24.885,
    "lastUpdated": "2024-01-15T10:30:00Z"
  }
}
Get Transaction History
GET /billing/credits/history/{org_id}?limit=50&offset=0
Authorization: Bearer YOUR_API_KEY
Response:
{
  "success": true,
  "data": {
    "transactions": [
      {
        "id": "txn_abc123",
        "type": "consumption",
        "amount": -0.615,
        "balanceBefore": 25.50,
        "balanceAfter": 24.885,
        "sessionId": "session_xyz789",
        "createdAt": "2024-01-15T10:30:00Z"
      },
      {
        "id": "txn_def456",
        "type": "topup",
        "amount": 39.99,
        "balanceBefore": 5.00,
        "balanceAfter": 44.99,
        "orderId": "lemon_order_123",
        "createdAt": "2024-01-10T08:00:00Z"
      }
    ],
    "total": 150
  }
}
}
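Because every transaction carries `balanceBefore`, `amount`, and `balanceAfter`, a history response can be sanity-checked on the client side. The helper below is a hypothetical sketch for that purpose; it converts the JSON numbers through strings into `Decimal` so that binary floating-point rounding does not produce false mismatches.

```python
from decimal import Decimal


def find_inconsistent(transactions):
    """Return ids of transactions where balanceBefore + amount != balanceAfter."""
    bad = []
    for txn in transactions:
        before = Decimal(str(txn["balanceBefore"]))
        amount = Decimal(str(txn["amount"]))
        after = Decimal(str(txn["balanceAfter"]))
        if before + amount != after:
            bad.append(txn["id"])
    return bad


history = [
    {"id": "txn_abc123", "amount": -0.615, "balanceBefore": 25.50, "balanceAfter": 24.885},
    {"id": "txn_def456", "amount": 39.99, "balanceBefore": 5.00, "balanceAfter": 44.99},
]
mismatches = find_inconsistent(history)
```

Both sample transactions from the response above are internally consistent, so `mismatches` is empty; a non-empty result would be a good starting point when investigating an unexpected balance.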
Troubleshooting
Call rejected due to insufficient credits
Check:
Current balance: GET /billing/credits/balance/{org_id}
Grace period status (balance may be negative but still allowed)
Recent transactions for unexpected charges
Fix: Add credits via Lemon Squeezy checkout.
Credits not added after payment
Check:
Lemon Squeezy order status (check email receipt)
Webhook logs: GET /lemon-squeezy/webhook-events
LemonSqueezyWebhookEvent.processed flag in database
Fix: Contact support with order ID if webhook failed.
Balance looks incorrect
Check:
Transaction history for all debits/credits
Recent session costs (may still be calculating)
Redis cache staleness (wait 30s or force refresh)
Fix: Force cache refresh: DELETE /billing/credits/cache/{org_id}
Session costs not calculated
Expected behavior: Costs are calculated within 10-20 seconds by the background worker.
Check:
Worker logs for errors
BillingSession.sessionStatus = "completed"
BillingSession.costsCalculatedAt timestamp
Fix: Manual cost calculation via admin tools if worker stuck.