Per-speaker, real-time captions powered by Whisper running inside your own stack. A speaking indicator appears the moment someone's mic is active. Transcribed text replaces it 1–3 seconds later. No API keys, no external service, no data leaves your server.
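The indicator-then-text behaviour described above can be modelled as a small per-speaker state update. This is only a sketch: `CaptionState`, `onSpeakingStarted`, `onTranscriptReceived`, and `renderCaption` are hypothetical names, and the real SDK payloads may look different.

```typescript
// Hypothetical caption state per speaker; not taken from the RTCstack SDK.
type CaptionState =
  | { kind: 'speaking' }
  | { kind: 'text'; text: string }

type Captions = Record<string, CaptionState>

// Mic became active: show the speaking indicator for that speaker.
function onSpeakingStarted(captions: Captions, speakerId: string): Captions {
  return { ...captions, [speakerId]: { kind: 'speaking' } }
}

// Transcript arrived a moment later: replace the indicator with the text.
function onTranscriptReceived(captions: Captions, speakerId: string, text: string): Captions {
  return { ...captions, [speakerId]: { kind: 'text', text } }
}

// What the UI would draw for one speaker ('' when nothing is known yet).
function renderCaption(state: CaptionState | undefined): string {
  if (!state) return ''
  return state.kind === 'speaking' ? '...' : state.text
}
```

Keeping the updates pure makes the indicator-to-text swap easy to unit test without a live room.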
🎥
HD Video & Audio
Multi-party video conferencing with adaptive bitrate, simulcast, and selective forwarding — all via LiveKit's battle-tested SFU. Supports thousands of participants with the right hardware.
🖥
Screen Sharing
Any participant can share their screen alongside the video call. Screen share tiles appear automatically in the grid. One-click toggle from the control bar.
💬
In-call Chat
Real-time text chat alongside the video — send public messages or DMs to specific participants. Chat history persists for the duration of the session.
⚡
Emoji Reactions
Participants can send emoji reactions that float across the screen. Lightweight, fun, and delivered entirely over LiveKit's data channel.
🔴
Recording
Start and stop recording via API or SDK. Recordings go straight to your MinIO/S3 bucket via LiveKit Egress. Combine with post-call transcription for fully indexed archives.
📄
Post-call Transcription
Trigger transcription on any recording after the call ends. Timestamped, speaker-attributed segments stored in Redis and retrievable via API. CPU or GPU Whisper.
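Timestamped, speaker-attributed segments map naturally onto subtitle formats. The sketch below converts segments to WebVTT; the `Segment` shape (seconds for `start` and `end`) is an assumption, not the documented API response.

```typescript
// Assumed segment shape; the actual API response may use different field names.
interface Segment {
  speaker: string
  text: string
  start: number // seconds from the start of the recording
  end: number
}

function pad(n: number, width: number): string {
  let s = String(n)
  while (s.length < width) s = '0' + s
  return s
}

// Format seconds as a WebVTT timestamp: hh:mm:ss.ttt
function toTimestamp(seconds: number): string {
  const ms = Math.round(seconds * 1000)
  const h = Math.floor(ms / 3600000)
  const m = Math.floor((ms % 3600000) / 60000)
  const s = Math.floor((ms % 60000) / 1000)
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms % 1000, 3)}`
}

// One cue per segment, with the speaker carried in a WebVTT voice tag.
function toWebVtt(segments: Segment[]): string {
  const cues = segments.map(
    (seg) => `${toTimestamp(seg.start)} --> ${toTimestamp(seg.end)}\n<v ${seg.speaker}>${seg.text}`
  )
  return ['WEBVTT', ...cues].join('\n\n') + '\n'
}
```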
👥
Participant Management
Role-based access (host, moderator, participant, viewer). Hosts can mute participants, remove them from the room, or demote them. Room metadata and webhook events for every state change.
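As an illustration of how role-scoped checks might look, here is a sketch of a permission table. Only the four role names and the host's abilities come from the description above; the `Action` names and the moderator, participant, and viewer rows are assumptions.

```typescript
type Role = 'host' | 'moderator' | 'participant' | 'viewer'
type Action = 'publish' | 'mute' | 'remove' | 'demote'

// Assumed permission matrix: only the host row reflects the feature description.
const allowed: Record<Role, Action[]> = {
  host: ['publish', 'mute', 'remove', 'demote'],
  moderator: ['publish', 'mute'],
  participant: ['publish'],
  viewer: [],
}

// True when the given role may perform the action.
function can(role: Role, action: Action): boolean {
  return allowed[role].indexOf(action) !== -1
}
```

A table like this keeps the rules in one place, so the same check can guard both UI controls and server-side handlers.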
🎛
Device Selection
Users can switch microphone, speaker, and camera mid-call. Devices are enumerated from the browser and swapped without dropping the connection.
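Device pickers usually start by grouping the browser's device list by kind. The helper below is a sketch; `DeviceLike` and `groupDevices` are hypothetical names, with `DeviceLike` mirroring the fields of the browser's `MediaDeviceInfo` so the code also runs outside a browser.

```typescript
type DeviceKind = 'audioinput' | 'audiooutput' | 'videoinput'

// Structural subset of the browser's MediaDeviceInfo.
interface DeviceLike {
  kind: DeviceKind
  deviceId: string
  label: string
}

// Split a flat device list into microphones, speakers, and cameras.
function groupDevices(devices: DeviceLike[]): Record<DeviceKind, DeviceLike[]> {
  const groups: Record<DeviceKind, DeviceLike[]> = {
    audioinput: [],
    audiooutput: [],
    videoinput: [],
  }
  for (const d of devices) groups[d.kind].push(d)
  return groups
}
```

In a browser, the input would come from `navigator.mediaDevices.enumerateDevices()`. Labels are typically empty until the user has granted media permission.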
🧩
Composable UI Kits
Drop in `<VideoConference />` for a complete experience, or compose your own layout from atomic pieces. Available for React, Vue 3, and Vanilla JS.
⚙
SDK-first
Framework-agnostic TypeScript SDK wraps LiveKit. Every feature has an event — `transcriptReceived`, `speakingStarted`, `messageReceived`, `recordingStarted`. Under 80 kB gzipped.
🔐
Secure by default
HMAC-signed API requests, short-lived JWTs, role-scoped token grants, replay-attack prevention. No plaintext secrets. Everything self-hosted.
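One common way to combine HMAC signing with replay prevention is to sign a timestamp and a one-time nonce along with the request. The sketch below uses Node's built-in `crypto` module; the canonical string, the 30-second window, and the names `SignedRequest`, `sign`, and `verify` are assumptions, not the actual RTCstack scheme.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Hypothetical request shape; the real API's signing scheme may differ.
interface SignedRequest {
  method: string
  path: string
  body: string
  timestamp: number // milliseconds since epoch
  nonce: string
}

const MAX_SKEW_MS = 30000 // assumed freshness window
const seenNonces = new Set<string>() // in production this would need expiry

function sign(secret: string, req: SignedRequest): string {
  const payload = [req.method.toUpperCase(), req.path, String(req.timestamp), req.nonce, req.body].join('\n')
  return createHmac('sha256', secret).update(payload).digest('hex')
}

function verify(secret: string, req: SignedRequest, signature: string, now: number = Date.now()): boolean {
  // Reject stale or future-dated requests.
  if (Math.abs(now - req.timestamp) > MAX_SKEW_MS) return false
  // Reject a nonce that has already been accepted (replay).
  if (seenNonces.has(req.nonce)) return false
  const expected = Buffer.from(sign(secret, req), 'hex')
  const given = Buffer.from(signature, 'hex')
  // Constant-time comparison; the length check avoids timingSafeEqual throwing.
  if (expected.length !== given.length || !timingSafeEqual(expected, given)) return false
  seenNonces.add(req.nonce)
  return true
}
```

The nonce is recorded only after the signature checks out, so an attacker cannot burn valid nonces by sending unsigned junk.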
RTCstack is a self-hosted real-time communication platform that gives you everything you'd normally pay a SaaS provider for — without sending a byte of audio, video, or transcript to anyone else.
It wraps LiveKit (an open-source WebRTC SFU) with:
- A thin REST API for token signing and for room, participant, recording, and transcription management
- A TypeScript SDK (`createCall()` → `connect()` → events) that abstracts the entire session
- Live and post-call transcription via faster-whisper, included rather than optional
- UI kits for React, Vue 3, and Vanilla JS, with every component usable standalone
```bash
# Start the full stack including live transcription
cd docker && docker compose --profile stt-live up

# Request a token
# POST /v1/token → { token, url }
```

```tsx
// Connect and display everything in React
import { createCall } from '@rtcstack/sdk'
import { VideoConference } from '@rtcstack/ui-react'

const call = createCall({ token, url })
await call.connect()

// Full conference: video, controls, chat, screen share, reactions, transcription
<VideoConference call={call} showTranscript />
```
```ts
// Or wire up every feature from raw SDK events
call.on('participantJoined', (p) => console.log(p.name, 'joined'))
call.on('messageReceived', (m) => appendChat(m.fromName, m.text))
call.on('speakingStarted', (id, name) => showIndicator(name))
call.on('transcriptReceived', (seg) => appendTranscript(seg.speaker, seg.text))
call.on('recordingStarted', () => showRecordingBadge())
call.on('screenShareStarted', (p) => showScreenTile(p))
call.on('activeSpeakerChanged', (speakers) => highlightSpeaker(speakers[0]))
```
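Handlers like the ones above can be exercised without a live room by using a typed stand-in emitter. This is a sketch for testing only: `FakeCall` and `CallEvents` are hypothetical, the event names are taken from the snippet above, and the payload shapes are assumptions.

```typescript
// Event names come from the snippet above; payload shapes are assumed.
interface CallEvents {
  speakingStarted: [id: string, name: string]
  transcriptReceived: [seg: { speaker: string; text: string }]
  messageReceived: [m: { fromName: string; text: string }]
}

type Handler = (...args: any[]) => void

// Minimal stand-in for the SDK's call object, exposing only on() and emit().
class FakeCall {
  private handlers: Record<string, Handler[]> = {}

  on<K extends keyof CallEvents>(event: K, handler: (...args: CallEvents[K]) => void): void {
    const list = this.handlers[event] || []
    list.push(handler)
    this.handlers[event] = list
  }

  emit<K extends keyof CallEvents>(event: K, ...args: CallEvents[K]): void {
    for (const h of this.handlers[event] || []) h(...args)
  }
}

// Usage: collect transcript lines exactly as a UI handler would.
const transcriptLines: string[] = []
const fake = new FakeCall()
fake.on('transcriptReceived', (seg) => transcriptLines.push(`${seg.speaker}: ${seg.text}`))
fake.emit('transcriptReceived', { speaker: 'Ana', text: 'Hello everyone' })
```

Typing the event map once means a misspelled event name or a wrong payload shape fails at compile time rather than in the call.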