
Overview

TalkifAI uses a credit-based billing system. Organizations purchase credits via Lemon Squeezy, and voice sessions consume those credits. The billing system is a separate FastAPI microservice running on Google Cloud Run.
Key features:
  • Pay-as-you-go credit system (1 credit = $1 = ~1 minute of voice conversation)
  • BYOC (Bring Your Own Carrier) — no telephony markup
  • Grace period of $5 (calls allowed even with negative balance up to -$5)
  • Redis caching for fast quota checks (30s TTL)
  • Automatic cost calculation via background worker
  • Low balance email notifications

Architecture

┌───────────────────────────────────────────────────────┐
│                   Frontend (Studio)                    │
│              Display balances, usage, payments         │
└────────────────────────┬──────────────────────────────┘
                         │ REST API

┌───────────────────────────────────────────────────────┐
│              Voice Agent Runtime (VM)                  │
│  ┌────────────────────────────────────────────────┐   │
│  │  Lemon Squeezy Service                         │   │
│  │  • Webhook: POST /lemon-squeezy/webhook        │   │
│  │  • Verifies signature (HMAC-SHA256)            │   │
│  │  • Processes orders                            │   │
│  │  • Adds credits on payment                     │   │
│  └────────────────────────────────────────────────┘   │
└────────────────────────┬──────────────────────────────┘


┌───────────────────────────────────────────────────────┐
│              Billing Service (Cloud Run)               │
│                                                        │
│  API Endpoints:                                        │
│  • POST /billing/sessions/start — Start session       │
│  • POST /billing/sessions/end — End + calculate cost  │
│  • POST /billing/credits/check — Check quota          │
│  • POST /billing/credits/consume — Deduct credits     │
│  • GET /billing/credits/balance/{org_id} — Balance    │
│  • GET /billing/credits/history/{org_id} — History    │
│  • GET /billing/dashboard — Usage analytics           │
│  • GET /billing/pricing — Pricing rules               │
│                                                        │
│  Services:                                             │
│  • SessionService — Track call lifecycle              │
│  • CreditService — Credit balance management          │
│  • PricingService — Cost calculation                  │
│  • NotificationService — Low balance alerts           │
│  • OrganizationService — Org management               │
│                                                        │
│  Background Workers:                                   │
│  • CostCalculatorWorker — Async cost processing       │
│                                                        │
│  Redis Cache (30s TTL):                                │
│  • Credit balance caching                             │
│  • Quota caching                                      │
└───────────────────────────────────────────────────────┘


┌───────────────────────────────────────────────────────┐
│         PostgreSQL Database (Neon Serverless)          │
│  • BillingSession — Session tracking with costs       │
│  • CreditTransaction — Credit movement history        │
│  • PricingRule — Dynamic pricing configuration        │
│  • LemonSqueezyOrder — One-time credit purchases      │
│  • LemonSqueezyWebhookEvent — Webhook audit log       │
└───────────────────────────────────────────────────────┘

Credit System

How Credits Work

1 credit = $1 USD = ~1 minute of voice conversation
Credits are:
  • Purchased via Lemon Squeezy (payment processor)
  • Stored per organization in the database
  • Consumed at session end (not during the call)
  • Cached in Redis (30-second TTL) for fast quota checks

Credit Flow

1. Pre-call: check available credits (optional)
     ├── Insufficient → Reject call
     └── OK → Allow call

2. Call runs (credits NOT deducted yet)

3. Call ends → POST /billing/sessions/end

4. Session marked as "completed"
     • endTime recorded
     • duration calculated

5. CostCalculatorWorker (background, polls every 10s)
     • Finds completed sessions
     • Calculates cost based on:
         - Duration (seconds)
         - Base cost (infrastructure)
         - Model cost (AI processing, if platform keys)
         - STT/TTS costs (if platform keys)

6. POST /billing/credits/consume
     • Optimistic locking prevents race conditions
     • Credits deducted from org balance
     • CreditTransaction record created
     • Redis cache invalidated

7. Notification check
     • Balance < $1.00 → Email alert
     • Balance < $0.00 → Grace period warning
     • Balance < -$4.00 → Critical alert
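
Steps 5 and 6 of the flow can be sketched as a single worker pass. This is a minimal in-memory illustration, not the service's actual CostCalculatorWorker: the function name `run_worker_pass`, the `price` and `consume` callbacks, and the dict-based session records are assumptions made for the example. The field names `sessionStatus` and `costsCalculatedAt` follow the BillingSession fields mentioned under Troubleshooting.

```python
from datetime import datetime, timezone
from decimal import Decimal
from typing import Callable


def run_worker_pass(
    sessions: list[dict],
    price: Callable[[dict], Decimal],
    consume: Callable[[str, Decimal, str], None],
) -> int:
    """Bill every completed session that has no cost yet; return how many were billed."""
    processed = 0
    for session in sessions:
        # Skip sessions that are still running or were already billed.
        if session["sessionStatus"] != "completed":
            continue
        if session.get("costsCalculatedAt") is not None:
            continue
        cost = price(session)  # hypothetical stand-in for PricingService
        # Hypothetical stand-in for POST /billing/credits/consume.
        consume(session["organizationId"], cost, session["roomName"])
        # Marking the session makes a second pass skip it.
        session["costsCalculatedAt"] = datetime.now(timezone.utc)
        processed += 1
    return processed
```

A real worker would call this kind of pass on a timer (every 10 seconds, per the flow above) and would need to mark the session and deduct the credits atomically.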

Grace Period Logic

TalkifAI uses a grace period system to prevent sudden service interruption:
| Balance | Status | Calls Allowed |
|---|---|---|
| ≥ $0.00 | ✅ Normal | ✅ Yes |
| -$5.00 to $0.00 | ⚠️ Warning | ✅ Yes |
| < -$5.00 | ❌ Blocked | ❌ No |
Purpose: The $5 grace period prevents sudden service interruption due to timing issues or unexpected usage spikes.
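
The grace period rule can be expressed as a small pure function. This is a sketch under the thresholds stated above; the function name `balance_status` and the status strings are illustrative, not the service's actual identifiers.

```python
from decimal import Decimal

GRACE_LIMIT = Decimal("5.00")  # size of the grace period described above


def balance_status(balance: Decimal) -> tuple[str, bool]:
    """Return (status, calls_allowed) for an organization balance in USD."""
    if balance >= 0:
        return ("normal", True)
    if balance >= -GRACE_LIMIT:
        # Negative, but still inside the $5 grace period.
        return ("warning", True)
    return ("blocked", False)
```

For example, `balance_status(Decimal("-2.30"))` falls in the warning band and still allows calls, while anything below -$5.00 is blocked.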

Pricing Model

BYOC Architecture (Bring Your Own Carrier)

Customers provide their own SIP credentials (Twilio, Telnyx, etc.) and pay carriers directly. The platform charges only for:
  • Base platform fee (infrastructure)
  • AI usage (STT/TTS/LLM if using platform keys)
No telephony markup — carrier costs are NOT charged by TalkifAI.

Cost Breakdown

The total cost of a session is calculated as:
TOTAL = Base Cost + Model Cost + STT Cost + Voice Cost

Where:
• Base Cost — Infrastructure fee (always charged)
• Model Cost — LLM processing (only if keyMode = "talkifai_keys")
• STT Cost — Speech-to-text (only if keyMode = "talkifai_keys")
• Voice Cost — TTS voice (only if keyMode = "talkifai_keys")

Example Rates (via PricingRule table)

| Component | Rate | When Charged |
|---|---|---|
| Base (Webcall + platform keys) | ~$0.05/min | Always |
| Base (Webcall + own keys) | ~$0.02/min | Always |
| Base (Telephony + platform keys) | ~$0.10/min | Always |
| LLM (GPT-4o-mini) | ~$0.015/min | Platform keys only |
| LLM (Gemini Flash) | ~$0.010/min | Platform keys only |
| STT (Deepgram) | ~$0.003/min | Platform keys only |
| TTS (Cartesia) | ~$0.005/min | Platform keys only |
Example Calculation:
Session: Telephony call, 5 minutes, platform keys
• Base cost: $0.10/min × 5 min = $0.50
• LLM cost: $0.015/min × 5 min = $0.075
• STT cost: $0.003/min × 5 min = $0.015
• TTS cost: $0.005/min × 5 min = $0.025
• Carrier cost: $0.00 (customer pays Twilio directly)
• TOTAL: $0.615 (0.615 credits, since 1 credit = $1)
Actual rates are stored in the PricingRule database table and can be updated dynamically without code changes.
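
The example calculation can be reproduced with a short function. This is a sketch that hard-codes the example rates from the table; the `RATES` dictionary, the function name `session_cost`, and the key `gemini_flash` are assumptions for illustration, since the real rates come from the PricingRule table. `Decimal` is used so that currency arithmetic stays exact.

```python
from decimal import Decimal

# Example per-minute rates copied from the table above (illustrative only).
RATES = {
    "base": {
        ("webcall", "talkifai_keys"): Decimal("0.05"),
        ("webcall", "my_keys"): Decimal("0.02"),
        ("telephony", "talkifai_keys"): Decimal("0.10"),
    },
    "llm": {"gpt_4o_mini": Decimal("0.015"), "gemini_flash": Decimal("0.010")},
    "stt": {"deepgram": Decimal("0.003")},
    "tts": {"cartesia": Decimal("0.005")},
}


def session_cost(duration_seconds: int, session_type: str, key_mode: str,
                 model: str, stt: str, tts: str) -> Decimal:
    """TOTAL = Base Cost + Model Cost + STT Cost + Voice Cost, in USD."""
    minutes = Decimal(duration_seconds) / Decimal(60)
    # The base fee is always charged.
    total = RATES["base"][(session_type, key_mode)] * minutes
    # AI usage is charged only when the platform's keys are used.
    if key_mode == "talkifai_keys":
        per_minute = RATES["llm"][model] + RATES["stt"][stt] + RATES["tts"][tts]
        total += per_minute * minutes
    return total
```

Calling `session_cost(300, "telephony", "talkifai_keys", "gpt_4o_mini", "deepgram", "cartesia")` gives 0.615, matching the worked example. Combinations missing from the example table (such as telephony with own keys) raise a `KeyError` in this sketch.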

Payment Processing

TalkifAI uses Lemon Squeezy for payment processing:

Purchase Flow

1. User selects credit package in Studio


2. Redirected to Lemon Squeezy checkout


3. User completes payment


4. Lemon Squeezy sends webhook to TalkifAI
   POST /lemon-squeezy/webhook


5. Webhook verified (HMAC-SHA256 signature)


6. LemonSqueezyOrder record created


7. Credits added to organization balance


8. CreditTransaction record created (type: "topup")


9. Redis cache invalidated


10. User can immediately use new credits

Supported Events

| Event | Action |
|---|---|
| order_created | Add credits from one-time purchase |
Webhook payload includes:
  • meta.custom_data.organization_id — Links payment to org (Lemon Squeezy places custom data under the payload's meta object)
  • meta.custom_data.user_id — User who made purchase
  • data.attributes.total — Amount in cents ($19.99 = 1999)
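
Signature verification and the cents-to-credits conversion can be sketched as follows. This assumes the signature arrives as a hex-encoded HMAC-SHA256 digest of the raw request body, which is how Lemon Squeezy describes its X-Signature header; the function names are illustrative and the sketch is not the service's actual handler.

```python
import hashlib
import hmac
from decimal import Decimal


def verify_signature(raw_body: bytes, signature_hex: str, secret: str) -> bool:
    """Check a hex HMAC-SHA256 signature computed over the raw request body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected, signature_hex)


def credits_from_order(payload: dict) -> Decimal:
    """Convert data.attributes.total (cents) into credits, where 1 credit = $1."""
    cents = payload["data"]["attributes"]["total"]
    return Decimal(cents) / Decimal(100)
```

The signature must be checked against the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the digest.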

Credit Packages

Credits are purchased as one-time payments (no subscriptions):
| Package | Price | Credits Added |
|---|---|---|
| Starter | $9.99 | $9.99 in credits |
| Pro | $39.99 | $39.99 in credits |
| Enterprise | $99.99 | $99.99 in credits |
Credits never expire and remain in the organization’s balance indefinitely.

Session Tracking

Start Session

POST /billing/sessions/start
Content-Type: application/json

{
  "organizationId": "org_abc123",
  "userId": "user_xyz789",
  "agentId": "agent_def456",
  "roomName": "room_session_123",
  "sessionType": "webcall",
  "keyMode": "talkifai_keys",
  "agentModel": "gpt_4o_mini",
  "agentArchitecture": "pipeline",
  "sttProvider": "deepgram",
  "voiceId": "cartesia-voice-123"
}
sessionType is "webcall" or "telephony"; keyMode is "talkifai_keys" or "my_keys".
Response:
{
  "success": true,
  "sessionId": "session_abc123",
  "message": "Session started successfully"
}

End Session

POST /billing/sessions/end
Content-Type: application/json

{
  "roomName": "room_session_123",
  "endTime": "2024-01-15T10:30:00Z"
}
Response:
{
  "success": true,
  "sessionId": "session_abc123",
  "duration": 180,
  "message": "Session ended successfully. Costs will be calculated shortly."
}
Costs are calculated asynchronously by the background worker (usually within 10-20 seconds).
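
The `duration` field in the response is the elapsed time in seconds between session start and `endTime`. A minimal sketch of that calculation follows; the helper names and the example start time are hypothetical, chosen so the result matches the 180 seconds shown above.

```python
from datetime import datetime


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2024-01-15T10:30:00Z"."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def duration_seconds(start_time: str, end_time: str) -> int:
    """Whole seconds between two ISO 8601 timestamps."""
    delta = parse_utc(end_time) - parse_utc(start_time)
    if delta.total_seconds() < 0:
        raise ValueError("end_time is earlier than start_time")
    return int(delta.total_seconds())
```

With an assumed start of `2024-01-15T10:27:00Z` and the `endTime` from the request above, `duration_seconds` returns 180.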

Concurrency Safety

Optimistic Locking

To prevent race conditions when multiple calls end simultaneously:
WITHOUT Optimistic Locking (WRONG):
────────────────────────────────────────────────────────
Thread A reads: balance = $10.00
Thread B reads: balance = $10.00  ← Both read same value!
Thread A writes: $10.00 - $1.00 = $9.00
Thread B writes: $10.00 - $1.50 = $8.50  ← Overwrites Thread A!
Final balance: $8.50  ❌ WRONG! (Should be $7.50)

WITH Optimistic Locking (CORRECT):
────────────────────────────────────────────────────────
Thread A reads: balance = $10.00, timestamp = T1
Thread B reads: balance = $10.00, timestamp = T1
Thread A updates: WHERE timestamp = T1 → Success
  New balance: $9.00, timestamp = T2
Thread B updates: WHERE timestamp = T1 → Fails! (timestamp changed)
Thread B retries: reads balance = $9.00, timestamp = T2
Thread B updates: WHERE timestamp = T2 → Success
  New balance: $7.50, timestamp = T3
Final balance: $7.50  ✅ CORRECT!
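
The retry loop in the trace above can be sketched with an in-memory store. This is an illustration of the compare-and-set pattern, not the service's database code: an integer `version` stands in for the timestamp column, and the class and function names are assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BalanceRow:
    balance: Decimal
    version: int = 0  # stands in for the timestamp column


class BalanceStore:
    """In-memory stand-in for one organization's balance row."""

    def __init__(self, balance: Decimal) -> None:
        self._row = BalanceRow(balance)

    def read(self) -> BalanceRow:
        return self._row

    def compare_and_set(self, expected_version: int, new_balance: Decimal) -> bool:
        # Equivalent of: UPDATE ... SET balance = ... WHERE version = expected_version
        if self._row.version != expected_version:
            return False  # another writer got there first
        self._row = BalanceRow(new_balance, expected_version + 1)
        return True


def consume(store: BalanceStore, amount: Decimal, max_retries: int = 3) -> Decimal:
    """Deduct amount, re-reading and retrying whenever the row changed underneath us."""
    for _ in range(max_retries):
        row = store.read()
        new_balance = row.balance - amount
        if store.compare_and_set(row.version, new_balance):
            return new_balance
    raise RuntimeError("balance update failed after repeated conflicts")
```

In a real database the check and the write happen in a single conditional UPDATE statement, and the affected-row count tells the caller whether to retry.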

Redis Caching

Cache Flow

Request: Check credit for org_123


┌──────────────────┐
│ Check Redis      │  Key: "credit:balance:org_123"
│ TTL: 30 seconds  │
└────┬─────────────┘

     ├─ HIT? → Return cached balance (fast path - 5ms)

     └─ MISS? → Query database (slow path - 50ms)
                 └─ Store in cache (TTL: 30s)
                     └─ Return balance

Cache Invalidation

Cache is invalidated when:
  • Credits are consumed (session ends)
  • Credits are added (payment successful)
  • Manual balance refresh requested
Performance Impact:
  • Without cache: 100 requests = 100 DB queries
  • With cache: 100 requests = ~3-4 DB queries (about one query per 30-second TTL window, a 96% reduction)
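
The cache flow and invalidation can be sketched as a cache-aside wrapper. This uses an in-memory dictionary in place of Redis so the example stays self-contained; the class name `BalanceCache`, the `load_from_db` callback, and the injectable clock are assumptions for illustration.

```python
import time
from decimal import Decimal
from typing import Callable


class BalanceCache:
    """Cache-aside lookup with a TTL, using a dict as a stand-in for Redis."""

    def __init__(self, load_from_db: Callable[[str], Decimal],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[Decimal, float]] = {}
        self.db_queries = 0  # counts slow-path lookups

    def get(self, org_id: str) -> Decimal:
        key = f"credit:balance:{org_id}"
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # HIT: fast path
        # MISS or expired: query the database and store the result with a TTL.
        self.db_queries += 1
        balance = self._load(org_id)
        self._entries[key] = (balance, now + self._ttl)
        return balance

    def invalidate(self, org_id: str) -> None:
        # Called after credits are consumed or added, or on manual refresh.
        self._entries.pop(f"credit:balance:{org_id}", None)
```

With one request per second for 100 seconds, this sketch reaches the database at seconds 0, 30, 60, and 90, which is four queries and consistent with the figure above.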

Notifications

The billing service sends email notifications for:
| Trigger | Threshold | Email Content |
|---|---|---|
| Low Balance | Balance < $1.00 | "Your balance is low. Please add credits." |
| Grace Period | Balance < $0.00 | "You're using grace period credits. Add credits now." |
| Critical | Balance < -$4.00 | "Service suspension imminent. Add credits immediately." |
Notifications are sent to the organization owner’s email address.
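
Because the three thresholds nest (a balance below -$4.00 is also below $0.00 and $1.00), a check has to test the most severe condition first. The sketch below assumes only the most severe matching notification is sent, which the source does not state; the function name and level strings are illustrative.

```python
from decimal import Decimal
from typing import Optional


def notification_level(balance: Decimal) -> Optional[str]:
    """Return the most severe notification that applies, or None if none does."""
    if balance < Decimal("-4.00"):
        return "critical"
    if balance < Decimal("0.00"):
        return "grace_period"
    if balance < Decimal("1.00"):
        return "low_balance"
    return None
```

For example, a balance of $0.50 maps to `low_balance`, and -$4.50 maps to `critical`.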

API Reference

Check Credit Quota

POST /billing/credits/check
Content-Type: application/json

{
  "organizationId": "org_abc123"
}
Response:
{
  "success": true,
  "data": {
    "allowed": true,
    "balance": 25.50,
    "gracePeriodUsed": 0.00,
    "gracePeriodLimit": 5.00
  }
}

Consume Credits

POST /billing/credits/consume
Content-Type: application/json

{
  "organizationId": "org_abc123",
  "sessionCost": 0.615,
  "roomName": "room_session_123"
}
Response:
{
  "success": true,
  "data": {
    "balanceBefore": 25.50,
    "balanceAfter": 24.885,
    "amountConsumed": 0.615
  }
}

Get Balance

GET /billing/credits/balance/{org_id}
Authorization: Bearer YOUR_API_KEY
Response:
{
  "success": true,
  "data": {
    "organizationId": "org_abc123",
    "balance": 24.885,
    "lastUpdated": "2024-01-15T10:30:00Z"
  }
}

Get Transaction History

GET /billing/credits/history/{org_id}?limit=50&offset=0
Authorization: Bearer YOUR_API_KEY
Response:
{
  "success": true,
  "data": {
    "transactions": [
      {
        "id": "txn_abc123",
        "type": "consumption",
        "amount": -0.615,
        "balanceBefore": 25.50,
        "balanceAfter": 24.885,
        "sessionId": "session_xyz789",
        "createdAt": "2024-01-15T10:30:00Z"
      },
      {
        "id": "txn_def456",
        "type": "topup",
        "amount": 39.99,
        "balanceBefore": 5.00,
        "balanceAfter": 44.99,
        "orderId": "lemon_order_123",
        "createdAt": "2024-01-10T08:00:00Z"
      }
    ],
    "total": 150
  }
}
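
Since the history endpoint is paginated with `limit` and `offset` and reports a `total`, a client can walk all pages with a small generator. In this sketch, `fetch_page` is a hypothetical callable that performs the GET request for a given limit and offset and returns the parsed JSON body; the HTTP call itself is left out so the example has no dependencies.

```python
from typing import Callable, Iterator


def iter_transactions(fetch_page: Callable[[int, int], dict],
                      limit: int = 50) -> Iterator[dict]:
    """Yield every transaction by advancing offset until total is reached."""
    offset = 0
    while True:
        data = fetch_page(limit, offset)["data"]
        page = data["transactions"]
        yield from page
        offset += len(page)
        # Stop on an empty page as well, in case total changes between requests.
        if not page or offset >= data["total"]:
            return
```

With `total` at 150 and the default limit of 50, this issues three requests.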

Troubleshooting

Calls are rejected for insufficient credits
Check:
  1. Current balance: GET /billing/credits/balance/{org_id}
  2. Grace period status (balance may be negative but still allowed)
  3. Recent transactions for unexpected charges
Fix: Add credits via Lemon Squeezy checkout.

Credits were not added after a payment
Check:
  1. Lemon Squeezy order status (check email receipt)
  2. Webhook logs: GET /lemon-squeezy/webhook-events
  3. LemonSqueezyWebhookEvent.processed flag in database
Fix: Contact support with order ID if webhook failed.

Balance looks incorrect
Check:
  1. Transaction history for all debits/credits
  2. Recent session costs (may still be calculating)
  3. Redis cache staleness (wait 30s or force refresh)
Fix: Force cache refresh: DELETE /billing/credits/cache/{org_id}

Session costs are not calculated
Expected behavior: Costs are calculated within 10-20 seconds by the background worker.
Check:
  1. Worker logs for errors
  2. BillingSession.sessionStatus = “completed”
  3. BillingSession.costsCalculatedAt timestamp
Fix: Manual cost calculation via admin tools if worker stuck.