Post-call Transcription

Trigger transcription on any recording after the call ends. The stt-worker fetches the recording from MinIO, sends it to Whisper, and stores timestamped segments in Redis.

Enable

bash
# docker/.env
TRANSCRIPTION_POST_ENABLED=true
WHISPER_MODEL=base

Then start the services with the stt-post profile:

bash
docker compose --profile stt-post up -d

Trigger transcription

http
POST /v1/recordings/:recordingId/transcribe
Content-Type: application/json

{ "language": "en" }

Response:

json
{ "transcriptionId": "tr_xyz789", "status": "queued" }

Poll for results

http
GET /v1/transcriptions/:transcriptionId

While processing:

json
{ "transcriptionId": "tr_xyz789", "status": "processing" }

When complete:

json
{
  "transcriptionId": "tr_xyz789",
  "status": "completed",
  "language": "en",
  "duration": "142.5",
  "segmentCount": "47",
  "text": "[00:00:01] Alice: Can everyone hear me?\n[00:00:05] Bob: Yes, loud and clear.\n..."
}
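The `text` field in the completed response puts each segment on its own line. The sketch below splits it back into structured segments. It assumes every line follows the `[HH:MM:SS] Speaker: utterance` shape seen in the sample response, which the API does not formally guarantee here; `TranscriptSegment` and `parseTranscriptText` are illustrative names, not part of the API.

```typescript
// Hypothetical helper, not part of the API. Assumes each line of `text`
// looks like "[HH:MM:SS] Speaker: utterance", as in the sample response.
interface TranscriptSegment {
  startSeconds: number
  speaker: string
  text: string
}

function parseTranscriptText(text: string): TranscriptSegment[] {
  const linePattern = /^\[(\d{2}):(\d{2}):(\d{2})\]\s+([^:]+):\s*(.*)$/
  const segments: TranscriptSegment[] = []
  for (const line of text.split('\n')) {
    const match = linePattern.exec(line.trim())
    if (!match) continue // skip blank or unrecognized lines
    const [, hh, mm, ss, speaker, utterance] = match
    segments.push({
      startSeconds: Number(hh) * 3600 + Number(mm) * 60 + Number(ss),
      speaker: String(speaker).trim(),
      text: String(utterance),
    })
  }
  return segments
}
```

Lines that do not match the pattern are skipped rather than treated as errors, so a change in the transcript format would produce fewer segments instead of an exception. Note that `duration` and `segmentCount` arrive as strings in the sample response, so convert them with `Number(...)` before doing arithmetic.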

Statuses: queued → processing → completed | failed
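For client code, the four statuses can be modeled as a union type, which lets the compiler catch misspelled status checks. This is a minimal sketch; the type and helper names are illustrative and not exported by any SDK.

```typescript
// Hypothetical client-side types; only the four status strings come from the docs.
type TranscriptionStatus = 'queued' | 'processing' | 'completed' | 'failed'

// Narrow an unknown value (for example, a field of a parsed JSON body) to a known status.
function isTranscriptionStatus(value: unknown): value is TranscriptionStatus {
  return value === 'queued' || value === 'processing' || value === 'completed' || value === 'failed'
}

// A job stops changing once it is completed or failed, so polling can end there.
function isTerminal(status: TranscriptionStatus): boolean {
  return status === 'completed' || status === 'failed'
}
```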

TypeScript example

typescript
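async function transcribeRecording(recordingId: string, apiUrl: string, headers: HeadersInit) {
  // Parse a JSON body, surfacing HTTP errors instead of reading fields off an error payload
  const readJson = async (res: Response) => {
    if (!res.ok) throw new Error(`Request failed: ${res.status} ${res.statusText}`)
    return res.json()
  }

  // Copy the caller's headers via Headers so that Headers instances and
  // [name, value] arrays work too (object spread would silently drop them)
  const postHeaders = new Headers(headers)
  postHeaders.set('Content-Type', 'application/json')

  // Trigger: queue a transcription job for the recording
  const { transcriptionId } = await fetch(`${apiUrl}/v1/recordings/${recordingId}/transcribe`, {
    method: 'POST',
    headers: postHeaders,
    body: JSON.stringify({ language: 'en' }),
  }).then(readJson)

  // Poll until the job reaches a terminal status
  while (true) {
    const result = await fetch(`${apiUrl}/v1/transcriptions/${transcriptionId}`, { headers }).then(readJson)
    if (result.status === 'completed') return result
    if (result.status === 'failed') throw new Error(`Transcription ${transcriptionId} failed`)
    await new Promise(resolve => setTimeout(resolve, 3000))  // wait 3 s between polls
  }
}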
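The loop above polls at a fixed interval with no deadline. For long recordings, a caller may prefer growing intervals and an overall timeout. The following is one possible sketch; `pollDelays`, `pollTranscription`, and all default values are hypothetical choices, not behavior defined by the API.

```typescript
// Hypothetical helpers; names and defaults are illustrative, not part of the API.
type TranscriptionResult = { transcriptionId: string; status: string; [key: string]: unknown }

// Delays (ms) between polls: start at `initialMs`, multiply by `factor`, cap at
// `maxMs`, and stop once the cumulative wait would exceed `timeoutMs`.
function pollDelays(initialMs = 3000, factor = 1.5, maxMs = 30000, timeoutMs = 3600000): number[] {
  const delays: number[] = []
  let next = initialMs
  let total = 0
  while (total + next <= timeoutMs) {
    delays.push(next)
    total += next
    next = Math.min(next * factor, maxMs)
  }
  return delays
}

// Calls `fetchStatus` once before each delay and once more after the last one,
// then gives up with a timeout error.
async function pollTranscription(
  fetchStatus: () => Promise<TranscriptionResult>,
  delays: number[] = pollDelays(),
): Promise<TranscriptionResult> {
  for (let attempt = 0; ; attempt++) {
    const result = await fetchStatus()
    if (result.status === 'completed') return result
    if (result.status === 'failed') throw new Error(`Transcription ${result.transcriptionId} failed`)
    if (attempt >= delays.length) throw new Error('Timed out waiting for transcription')
    await new Promise(resolve => setTimeout(resolve, delays[attempt]))
  }
}
```

Passing the status fetch in as a function keeps the helper independent of `fetch`, so a caller could write `pollTranscription(() => fetch(url, { headers }).then(r => r.json()))` and a test can supply a stub instead.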

Processing time

Post-call transcription processes the whole file at once. It takes less processing time per minute of audio than live mode, but results are not available in real time.

| Hardware | Model | 60 min recording | Notes |
|---|---|---|---|
| CPU 4-core | base | ~15–20 min | Fine for async use |
| CPU 8-core | small | ~20–25 min | Better accuracy |
| GPU 8GB | medium | ~4–6 min | Recommended |
| GPU 24GB | large-v3 | ~2–3 min | Best quality |

For hardware and model selection guidance, see Deployment.
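As a rough planning aid, the table figures can be scaled to other recording lengths. The sketch below assumes processing time grows linearly with audio duration and that each model runs on the hardware it is paired with in the table; neither assumption is stated in the documentation, and `estimateProcessingMinutes` is a hypothetical helper.

```typescript
// Figures copied from the processing-time table (minutes per 60 min of audio).
// Linear scaling with recording length is an assumption, not a documented guarantee.
const MINUTES_PER_HOUR_OF_AUDIO: Record<string, [number, number]> = {
  base: [15, 20],     // CPU 4-core
  small: [20, 25],    // CPU 8-core
  medium: [4, 6],     // GPU 8GB
  'large-v3': [2, 3], // GPU 24GB
}

// Returns a [low, high] estimate in minutes, or undefined for a model not in the table.
function estimateProcessingMinutes(model: string, recordingSeconds: number): [number, number] | undefined {
  const range = MINUTES_PER_HOUR_OF_AUDIO[model]
  if (!range) return undefined
  const hours = recordingSeconds / 3600
  return [range[0] * hours, range[1] * hours]
}
```

An estimate like this could be used to pick a polling timeout, with generous headroom since real throughput depends on the audio and the host.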

Environment variables

| Variable | Default | Description |
|---|---|---|
| TRANSCRIPTION_POST_ENABLED | false | Enable post-call transcription endpoints |
| WHISPER_MODEL | base | Whisper model |
| STT_LANGUAGE | en | Default language for transcription jobs |
| WHISPER_URL | http://whisper:8080 | Whisper service URL |
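A service or script that reads these variables could apply the documented defaults as follows. This is a sketch only: `SttPostConfig` and `loadSttPostConfig` are hypothetical names, and treating any casing of "true" as enabled is an assumption about parsing, not documented behavior of the stt-worker.

```typescript
// Hypothetical config loader; only the variable names and defaults come from the table.
interface SttPostConfig {
  enabled: boolean
  whisperModel: string
  defaultLanguage: string
  whisperUrl: string
}

function loadSttPostConfig(env: Record<string, string | undefined>): SttPostConfig {
  return {
    // Assumption: only the string "true" (any casing) enables the feature
    enabled: (env.TRANSCRIPTION_POST_ENABLED ?? 'false').toLowerCase() === 'true',
    whisperModel: env.WHISPER_MODEL ?? 'base',
    defaultLanguage: env.STT_LANGUAGE ?? 'en',
    whisperUrl: env.WHISPER_URL ?? 'http://whisper:8080',
  }
}
```

In Node.js this would typically be called as `loadSttPostConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.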