Quick Start

Get a working video call with live transcription in under 15 minutes.

Prerequisites

  • Docker ≥ 24 and Docker Compose ≥ 2.20
  • Node.js ≥ 20, pnpm ≥ 9 (for the SDK/UI kits)
  • A public IP or LAN IP for LiveKit ICE candidates

1. Clone and configure

bash
git clone https://github.com/bUxEE/rtcstack.git
cd rtcstack/docker
cp .env.example .env

Edit .env and fill in at least the following:

dotenv
LIVEKIT_API_KEY=devkey
LIVEKIT_API_SECRET=your-32-char-secret
LIVEKIT_RTC_EXTERNAL_IP=192.168.1.100   # your LAN or public IP
REDIS_PASSWORD=changeme
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=minioadmin
API_KEY=your-api-key
API_SECRET=your-api-secret
LIVEKIT_WSS_URL=ws://localhost:7880     # for local dev without Caddy

# Optional: enable live transcription
TRANSCRIPTION_LIVE_ENABLED=true
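
Before starting the stack, it can help to confirm that every required key has a value. The following is a minimal sketch, not part of RTCstack: `parseDotenv` and `missingKeys` are hypothetical helpers, and the required-key list simply mirrors the variables shown above. The parser handles only plain `KEY=value` lines with optional trailing `# comment` text.

```typescript
// Hypothetical helpers for illustration; not part of RTCstack.

// Keys this guide lists as the minimum to fill in.
const REQUIRED_KEYS = [
  'LIVEKIT_API_KEY',
  'LIVEKIT_API_SECRET',
  'LIVEKIT_RTC_EXTERNAL_IP',
  'REDIS_PASSWORD',
  'MINIO_ROOT_USER',
  'MINIO_ROOT_PASSWORD',
  'API_KEY',
  'API_SECRET',
  'LIVEKIT_WSS_URL',
] as const

// Parse simple KEY=value lines, skipping blank lines and full-line comments.
function parseDotenv(text: string): Record<string, string> {
  const vars: Record<string, string> = {}
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim()
    if (line === '' || line.charAt(0) === '#') continue
    const eq = line.indexOf('=')
    if (eq <= 0) continue
    const key = line.slice(0, eq).trim()
    // Whitespace followed by '#' is treated as the start of an inline comment.
    const value = line.slice(eq + 1).replace(/\s+#.*$/, '').trim()
    vars[key] = value
  }
  return vars
}

// Return the required keys that are absent or empty.
function missingKeys(vars: Record<string, string>): string[] {
  return REQUIRED_KEYS.filter(key => !vars[key])
}
```

A script could read `.env` from disk, pass the contents to `parseDotenv`, and refuse to continue while `missingKeys` returns a non-empty list.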

2. Start the stack

Video only:

bash
docker compose up livekit redis minio minio-init

Video + live transcription:

bash
docker compose --profile stt-live up

Wait until services show healthy:

bash
docker compose ps

3. Start the API

bash
cd ../apps/api
cp .env.example .env   # fill in values matching docker/.env
pnpm install
pnpm dev               # → http://localhost:3246

Verify: curl http://localhost:3246/v1/health

json
{
  "status": "ok",
  "version": "0.1.0",
  "capabilities": {
    "transcriptionLive": true,
    "transcriptionPost": false
  }
}
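
A client can read the capabilities in this response to decide whether to show transcription features. The sketch below validates the payload shape; `HealthStatus` and `parseHealth` are hypothetical names, and the field list is assumed from the sample response above rather than from a published schema.

```typescript
// Hypothetical type and helper; the shape is inferred from the sample response only.
interface HealthStatus {
  status: string
  version: string
  capabilities: {
    transcriptionLive: boolean
    transcriptionPost: boolean
  }
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// Return a typed health object, or throw if the payload does not match the expected shape.
function parseHealth(payload: unknown): HealthStatus {
  if (!isRecord(payload)) throw new Error('health payload is not an object')
  const caps = payload.capabilities
  if (!isRecord(caps)) throw new Error('health payload has no capabilities object')
  const status = payload.status
  const version = payload.version
  const live = caps.transcriptionLive
  const post = caps.transcriptionPost
  if (
    typeof status !== 'string' ||
    typeof version !== 'string' ||
    typeof live !== 'boolean' ||
    typeof post !== 'boolean'
  ) {
    throw new Error('health payload has unexpected field types')
  }
  return { status, version, capabilities: { transcriptionLive: live, transcriptionPost: post } }
}
```

In an app, the argument would typically be the parsed JSON body fetched from http://localhost:3246/v1/health, and `capabilities.transcriptionLive` would gate any live transcript UI.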

4. Run the React example

bash
cd ../examples/react   # apps/examples/react, relative to apps/api
pnpm install
pnpm dev               # → http://localhost:5173

Open two browser tabs at http://localhost:5173. Enter the same Room ID in both tabs, use a different name in each, and click Join Call. The two participants should see and hear each other, and if you enabled STT, transcripts appear automatically.

5. Integrate into your own app

ts
// Your backend: call the RTCstack API (server-side fetch needs an absolute URL)
const { token, url } = await fetch('http://localhost:3246/v1/token', {
  method: 'POST',
  headers: { 'X-Api-Key': API_KEY, 'Content-Type': 'application/json' },
  body: JSON.stringify({ roomId, userId, name, role: 'participant' }),
}).then(r => r.json())

// Your frontend
import { createCall } from '@rtcstack/sdk'
const call = createCall({ token, url })
await call.connect()
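
In production code the token request usually needs error handling. The following is a sketch under stated assumptions: `buildTokenRequest`, `requestToken`, and the `FetchLike` type are hypothetical names, the endpoint, `X-Api-Key` header, and body fields mirror the snippet above, and the `Content-Type` header follows the curl example in step 6. How the API reports errors is not documented here, so the sketch only checks the HTTP status and the presence of `token` and `url`.

```typescript
// Hypothetical helpers; not part of @rtcstack/sdk.
interface TokenRequest {
  roomId: string
  userId: string
  name: string
  role: string
}

interface TokenResponse {
  token: string
  url: string
}

interface JsonPostInit {
  method: string
  headers: Record<string, string>
  body: string
}

// Minimal structural type so the helper accepts the global fetch or a test double.
type FetchLike = (
  url: string,
  init: JsonPostInit
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

// Build the absolute URL and request options for POST /v1/token.
function buildTokenRequest(
  apiUrl: string,
  apiKey: string,
  request: TokenRequest
): { url: string; init: JsonPostInit } {
  return {
    url: apiUrl.replace(/\/+$/, '') + '/v1/token',
    init: {
      method: 'POST',
      headers: { 'X-Api-Key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify(request),
    },
  }
}

// Send the request and fail loudly on a non-2xx status or an incomplete body.
async function requestToken(
  fetchImpl: FetchLike,
  apiUrl: string,
  apiKey: string,
  request: TokenRequest
): Promise<TokenResponse> {
  const { url, init } = buildTokenRequest(apiUrl, apiKey, request)
  const response = await fetchImpl(url, init)
  if (!response.ok) {
    throw new Error(`Token request failed with HTTP ${response.status}`)
  }
  const data = (await response.json()) as Partial<TokenResponse>
  if (typeof data.token !== 'string' || typeof data.url !== 'string') {
    throw new Error('Token response is missing token or url')
  }
  return { token: data.token, url: data.url }
}
```

On a backend this could be called as `requestToken(fetch, 'http://localhost:3246', API_KEY, { roomId, userId, name, role: 'participant' })`, with the result passed on to the frontend for `createCall`.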

React — full conference with transcription:

tsx
import { VideoConference } from '@rtcstack/ui-react'

<VideoConference
  call={call}
  showTranscript   // live transcript panel, built in
  onLeave={() => call.disconnect()}
/>

Or wire up transcription events yourself:

ts
call.on('speakingStarted', (id, name) => {
  console.log(name, 'is speaking...')  // fires immediately, before Whisper
})

call.on('transcriptReceived', ({ speaker, text }) => {
  console.log(`${speaker}: ${text}`)   // ~1–3s after speaking ends
})
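
If you handle transcript events yourself, you will probably want to accumulate them into something displayable. The sketch below is one possible approach, not part of the SDK: `TranscriptLog` is a hypothetical helper that merges consecutive segments from the same speaker, and it assumes only the `{ speaker, text }` fields shown in the handler above.

```typescript
// Hypothetical helper; only the { speaker, text } fields come from the event handler above.
interface TranscriptEvent {
  speaker: string
  text: string
}

interface TranscriptTurn {
  speaker: string
  text: string
}

class TranscriptLog {
  private turns: TranscriptTurn[] = []

  // Append a segment, merging it into the previous turn when the speaker is unchanged.
  add(event: TranscriptEvent): void {
    const text = event.text.trim()
    if (text === '') return // ignore empty or whitespace-only segments
    const last = this.turns[this.turns.length - 1]
    if (last !== undefined && last.speaker === event.speaker) {
      last.text += ' ' + text
    } else {
      this.turns.push({ speaker: event.speaker, text })
    }
  }

  // Render one "speaker: text" line per turn.
  toLines(): string[] {
    return this.turns.map(turn => `${turn.speaker}: ${turn.text}`)
  }
}
```

It could be wired up with `call.on('transcriptReceived', event => transcriptLog.add(event))`, where `transcriptLog` is a `TranscriptLog` instance, re-rendering `toLines()` after each event.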

6. Start a transcription session

bash
curl -X POST http://localhost:3246/v1/rooms/my-room/transcription/start \
  -H "X-Api-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{ "language": "en" }'

Or from the SDK (add apiUrl + roomName to createCall):

ts
const call = createCall({
  token, url,
  roomName: 'my-room',
  apiUrl: 'http://localhost:3246',
})

await call.startTranscription()

See the Live Transcription guide for the full integration reference.