Post-call Transcription

Trigger transcription on any recording after the call ends. The stt-worker fetches the recording from MinIO, sends it to Whisper, and stores timestamped segments in Redis.

Enable

bash

# docker/.env
TRANSCRIPTION_POST_ENABLED=true
WHISPER_MODEL=base

bash

docker compose --profile stt-post up -d

Trigger transcription

http

POST /v1/recordings/:recordingId/transcribe
Content-Type: application/json

{ "language": "en" }

json

{ "transcriptionId": "tr_xyz789", "status": "queued" }

Poll for results

http

GET /v1/transcriptions/:transcriptionId

While processing:

json

{ "transcriptionId": "tr_xyz789", "status": "processing" }

When complete:

json

{
  "transcriptionId": "tr_xyz789",
  "status": "completed",
  "language": "en",
  "duration": "142.5",
  "segmentCount": "47",
  "text": "[00:00:01] Alice: Can everyone hear me?\n[00:00:05] Bob: Yes, loud and clear.\n..."
}
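If you need structured lines rather than one string, the `text` field can be split on the client. The sketch below assumes every line follows the `[HH:MM:SS] Speaker: text` pattern shown in the sample response; `TranscriptLine` and `parseTranscriptText` are hypothetical names, not part of the API.

```typescript
// Hypothetical shape for one parsed line; not part of the documented API.
interface TranscriptLine {
  offsetSeconds: number; // seconds from the start of the recording
  speaker: string;
  text: string;
}

// Parses lines of the form "[HH:MM:SS] Speaker: text", as in the sample response.
// Lines that do not match (such as a trailing "...") are skipped.
function parseTranscriptText(raw: string): TranscriptLine[] {
  const linePattern = /^\[(\d{2}):(\d{2}):(\d{2})\]\s+([^:]+):\s*(.*)$/;
  const lines: TranscriptLine[] = [];
  for (const line of raw.split("\n")) {
    const match = linePattern.exec(line.trim());
    if (!match) continue;
    const [, hh, mm, ss, speaker, text] = match;
    lines.push({
      offsetSeconds: Number(hh) * 3600 + Number(mm) * 60 + Number(ss),
      speaker: speaker.trim(),
      text,
    });
  }
  return lines;
}
```

Skipping unmatched lines keeps the parser tolerant of blank lines or formats not shown in the sample.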

Statuses: queued → processing → completed | failed
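For typed clients, the status flow can be modelled as a union type. This is a sketch based only on the sample responses above: the field names come from those samples, the optional markers are a guess at which fields appear before completion, and `isTerminal` and `readStats` are hypothetical helpers.

```typescript
// Status values as listed in the status flow.
type TranscriptionStatus = "queued" | "processing" | "completed" | "failed";

// Field names follow the sample responses; note that the numeric
// fields appear there as JSON strings, not numbers.
interface TranscriptionResponse {
  transcriptionId: string;
  status: TranscriptionStatus;
  language?: string;
  duration?: string;
  segmentCount?: string;
  text?: string;
}

// A job stops changing once it is completed or failed.
function isTerminal(status: TranscriptionStatus): boolean {
  return status === "completed" || status === "failed";
}

// Converts the string-encoded numbers of a completed job into numbers.
// Returns null for any other status or when the fields are missing.
function readStats(res: TranscriptionResponse): { duration: number; segmentCount: number } | null {
  if (res.status !== "completed" || res.duration === undefined || res.segmentCount === undefined) {
    return null;
  }
  return { duration: Number(res.duration), segmentCount: Number(res.segmentCount) };
}
```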

TypeScript example

typescript

async function transcribeRecording(recordingId: string, apiUrl: string, headers: Record<string, string>) {
  // Trigger the job; the response carries the ID used for polling
  const triggerRes = await fetch(`${apiUrl}/v1/recordings/${recordingId}/transcribe`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({ language: 'en' }),
  })
  if (!triggerRes.ok) throw new Error(`Trigger failed: HTTP ${triggerRes.status}`)
  const { transcriptionId } = await triggerRes.json()

  // Poll until the job reaches a terminal status
  while (true) {
    const pollRes = await fetch(`${apiUrl}/v1/transcriptions/${transcriptionId}`, { headers })
    if (!pollRes.ok) throw new Error(`Poll failed: HTTP ${pollRes.status}`)
    const result = await pollRes.json()
    if (result.status === 'completed') return result
    if (result.status === 'failed') throw new Error('Transcription failed')
    await new Promise(r => setTimeout(r, 3000))  // wait 3 s between polls
  }
}
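A polling loop without an upper bound waits forever if a job never leaves `queued` or `processing`. One possible variant, sketched here, adds a deadline and a growing delay between polls. `pollUntilDone` and `PollOptions` are hypothetical names, and the status check is passed in as a function so the helper does not depend on any particular HTTP client.

```typescript
interface PollOptions {
  intervalMs: number;    // delay before the second poll
  maxIntervalMs: number; // upper bound for the delay between polls
  timeoutMs: number;     // give up once this much time has passed
}

// Calls fetchStatus until the job is completed or failed, doubling the
// delay after each attempt and throwing once the deadline would be exceeded.
async function pollUntilDone<T extends { status: string }>(
  fetchStatus: () => Promise<T>,
  { intervalMs, maxIntervalMs, timeoutMs }: PollOptions,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let delay = intervalMs;
  while (true) {
    const result = await fetchStatus();
    if (result.status === "completed") return result;
    if (result.status === "failed") throw new Error("Transcription failed");
    if (Date.now() + delay > deadline) {
      throw new Error("Timed out waiting for transcription");
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    delay = Math.min(delay * 2, maxIntervalMs);
  }
}
```

For a long recording on CPU, a timeout well above the expected processing time is the safer choice, since a job may also spend time in the queue.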

Processing time

Post-call transcription processes the whole file at once. It is faster per minute of audio than live mode, but results are not available in real time.

| Hardware | Model | Processing time (60 min recording) | Notes |
| --- | --- | --- | --- |
| CPU 4-core | `base` | ~15–20 min | Fine for async use |
| CPU 8-core | `small` | ~20–25 min | Better accuracy |
| GPU 8GB | `medium` | ~4–6 min | Recommended |
| GPU 24GB | `large-v3` | ~2–3 min | Best quality |

For hardware and model selection guidance, see Deployment.
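To size a polling timeout, the table figures can be scaled to other recording lengths. The sketch below assumes processing time grows linearly with recording length, which the table does not state, so treat the result as a rough guide. The profile keys and `estimateProcessingMinutes` are hypothetical names.

```typescript
// Minutes of processing per 60 minutes of audio, copied from the
// processing-time table. Keys are made-up labels for each table row.
const MINUTES_PER_HOUR_OF_AUDIO = {
  "cpu4-base": [15, 20],
  "cpu8-small": [20, 25],
  "gpu8-medium": [4, 6],
  "gpu24-large-v3": [2, 3],
} as const;

type Profile = keyof typeof MINUTES_PER_HOUR_OF_AUDIO;

// Returns a [low, high] estimate in minutes.
// Assumption: time scales linearly with recording length.
function estimateProcessingMinutes(profile: Profile, recordingMinutes: number): [number, number] {
  const [low, high] = MINUTES_PER_HOUR_OF_AUDIO[profile];
  const scale = recordingMinutes / 60;
  return [low * scale, high * scale];
}
```

Under that assumption, `estimateProcessingMinutes("gpu8-medium", 30)` gives a range of 2 to 3 minutes.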

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| `TRANSCRIPTION_POST_ENABLED` | `false` | Enable post-call transcription endpoints |
| `WHISPER_MODEL` | `base` | Whisper model size, e.g. `base`, `small`, `medium`, `large-v3` |
| `STT_LANGUAGE` | `en` | Default language for transcription jobs |
| `WHISPER_URL` | `http://whisper:8080` | Whisper service URL |
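As an illustration of how these defaults resolve, the following sketch reads the four variables from an environment map. It is not the service's actual loader: `SttConfig` and `loadSttConfig` are hypothetical names, and treating only the literal string `true` as enabled is an assumption.

```typescript
interface SttConfig {
  postEnabled: boolean;
  whisperModel: string;
  language: string;
  whisperUrl: string;
}

// Falls back to the documented defaults when a variable is unset.
// Assumption: only the exact value "true" enables post-call transcription.
function loadSttConfig(env: Record<string, string | undefined>): SttConfig {
  return {
    postEnabled: env.TRANSCRIPTION_POST_ENABLED === "true",
    whisperModel: env.WHISPER_MODEL ?? "base",
    language: env.STT_LANGUAGE ?? "en",
    whisperUrl: env.WHISPER_URL ?? "http://whisper:8080",
  };
}
```

In Node.js this could be called as `loadSttConfig(process.env)`.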
