- Real-time: start, update, and stop transcription during an active call using the /rooms/{room_name}/transcription endpoints.
- Post-call: submit a recording ID or media URL to the Batch Processor after a call ends to generate a transcript (and optionally a summary).
Real-time transcription
Real-time transcription runs during an active call and streams text to all in-call participants as speech is detected. Transcripts can optionally be saved to storage as WebVTT files.
Starting transcription
- BCP-47 language tag. Examples: 'en', 'es', 'fr', 'de', 'ja', 'pt-BR'. Available languages depend on the Deepgram model in use.
- Deepgram model name. Examples: 'nova-2-general', 'nova-2', 'enhanced', 'base'. Higher-tier models offer better accuracy at higher cost.
- When true, Deepgram replaces profane words with asterisks.
- Entities to redact. Pass true to redact all supported entities, or an array of strings (e.g. ['pci', 'ssn']).
- How long Deepgram waits for silence before ending an utterance. Pass a number (milliseconds) or true/false.
- When true, Deepgram adds punctuation to the transcript.
- Identifier for this transcription instance. Required when running multiple simultaneous transcriptions in the same room.
- Array of session IDs. When provided, only those participants’ audio is transcribed. Omit to transcribe everyone.
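As a rough sketch of how these options could fit into a start request, the Python below builds (but does not send) the HTTP request. The base URL `https://api.daily.co/v1`, the `/start` path suffix, the bearer-token header, and the option names `language`, `model`, `punctuate`, `redact`, and `participants` are assumptions, because this section describes the options without naming them; only `instanceId` appears in the text. Check the start transcription reference before relying on any of them.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.daily.co/v1"  # assumed base URL


def build_start_request(room_name, api_key, **options):
    """Build a POST request that would start transcription in a room.

    Option names are assumptions. Options set to None are dropped from the
    body so that server-side defaults apply.
    """
    body = {key: value for key, value in options.items() if value is not None}
    return urllib.request.Request(
        # The "/start" suffix is an assumption about the endpoint path.
        url=f"{API_BASE}/rooms/{quote(room_name, safe='')}/transcription/start",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_start_request(
    "my-room",            # hypothetical room name
    "YOUR_API_KEY",       # placeholder, not a real key
    language="en",
    model="nova-2-general",
    punctuate=True,
    redact=["pci", "ssn"],
    instanceId="primary",
    participants=None,    # dropped from the body: transcribe everyone
)
```

Sending is left to the caller (for example with `urllib.request.urlopen(req)`), which keeps the construction step easy to inspect and test.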
Updating transcription
Change which participants are transcribed mid-call:
Pass instanceId to target a specific instance when running multiple transcriptions simultaneously.
See the update transcription endpoint for the full API reference.
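A minimal sketch of an update body, assuming the request accepts a `participants` array of session IDs alongside the optional `instanceId` (the `participants` key name and the session ID values are illustrative, not taken from this page):

```python
import json


def build_update_body(participants, instance_id=None):
    """Return the JSON body for an update-transcription call.

    `participants` is a list of session IDs. The "participants" key name
    is an assumption; "instanceId" is the name used in this guide.
    """
    body = {"participants": list(participants)}
    if instance_id is not None:
        body["instanceId"] = instance_id
    return json.dumps(body)


# Hypothetical session IDs, for illustration only.
payload = build_update_body(["session-id-1", "session-id-2"], instance_id="primary")
print(payload)
```

Leaving `instance_id` unset produces a body without `instanceId`, which suits rooms that run a single transcription.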
Stopping transcription
--data '{ "instanceId": "primary" }'
If transcript storage is enabled, the final WebVTT file is written when transcription stops. Listen for the transcript.ready-to-download webhook to know when the file is available.
See the stop transcription endpoint for the full API reference.
Auto-starting transcription
Instead of starting transcription manually, you can have it begin automatically when a meeting owner joins by setting auto_start_transcription on their meeting token. Pair it with auto_transcription_settings on the room to configure the model and language.
Meeting token (triggers auto-start for that owner):
See the auto_start_transcription meeting token property and auto_transcription_settings room property for full details.
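The two pieces of configuration might look like the request bodies below. This is a sketch under several assumptions: the `properties` wrapper, the `room_name` and `is_owner` token properties, the `language` and `model` keys inside `auto_transcription_settings`, and the endpoints named in the comments are not stated on this page. Only `auto_start_transcription` and `auto_transcription_settings` come from the text above.

```python
import json

# Body for creating the owner's meeting token (assumed: POST /meeting-tokens).
token_body = {
    "properties": {
        "room_name": "my-room",            # hypothetical room name
        "is_owner": True,                  # assumed owner flag
        "auto_start_transcription": True,  # property named in this guide
    }
}

# Body for configuring the room (assumed: POST /rooms/my-room).
room_body = {
    "properties": {
        "auto_transcription_settings": {
            "language": "en",           # assumed key name
            "model": "nova-2-general",  # assumed key name
        }
    }
}

print(json.dumps(token_body, indent=2))
print(json.dumps(room_body, indent=2))
```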
Transcript storage
By default, real-time transcripts are streamed to call participants but not saved. To persist transcripts as WebVTT files, enable storage at the room or domain level.
Room level:
Domain level, using the transcription_bucket domain property:
Post-call transcription
The Batch Processor generates transcripts and summaries from recordings or any publicly accessible audio/video URL, with no active call required.
Submitting a job
Specify a preset (transcript or summarize), an input source, and an output path:
- From a recording ID
- From a media URL
- With summary
Output formats for transcript jobs: txt, srt, vtt, json. The summarize preset produces a plain text file.
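The three submission variants above differ only in preset and input source, which a small helper can make explicit. In this sketch the `preset` field and its two values come from the text; the `inParams`, `outParams`, `sourceType`, `recordingId`, `uri`, `s3Config`, and `s3KeyTemplate` field names are assumptions to verify against the Batch Processor reference, and the IDs and URLs are made up.

```python
def build_batch_job(preset, *, recording_id=None, url=None, output_key="transcripts"):
    """Build the JSON-serializable body for a Batch Processor job.

    Exactly one of recording_id or url must be given. All field names
    other than "preset" are assumptions.
    """
    if preset not in ("transcript", "summarize"):
        raise ValueError(f"unknown preset: {preset!r}")
    if (recording_id is None) == (url is None):
        raise ValueError("provide exactly one of recording_id or url")
    if recording_id is not None:
        in_params = {"sourceType": "recordingId", "recordingId": recording_id}
    else:
        in_params = {"sourceType": "uri", "uri": url}
    return {
        "preset": preset,
        "inParams": in_params,
        "outParams": {"s3Config": {"s3KeyTemplate": output_key}},
    }


# Hypothetical inputs, for illustration only.
from_recording = build_batch_job("transcript", recording_id="rec-123")
from_url = build_batch_job("transcript", url="https://example.com/call.mp4")
with_summary = build_batch_job("summarize", recording_id="rec-123")
```

Rejecting ambiguous input up front (both or neither source given) avoids submitting a job that can only fail later.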
Retrieving batch job results
Poll job status or get a signed download URL using the job ID returned from submit:

| Endpoint | Description |
|---|---|
| Get job | GET /batch-processor/{jobId} — check status and see output locations |
| Get job access link | GET /batch-processor/{jobId}/access-link — get a signed download URL |
| List jobs | GET /batch-processor — list all jobs for your domain |
| Delete job | DELETE /batch-processor/{jobId} — delete a job and its output |
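Polling the get-job endpoint can be sketched as a loop with an interval and a timeout. The helper below takes the HTTP call as a parameter so it runs without network access; the `status` key and the terminal values `"finished"` and `"error"` are assumptions about the job object, and `job-123` is a made-up ID.

```python
import time


def wait_for_job(get_job, job_id, *, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_job(job_id) until the job reaches a terminal status.

    get_job is any callable returning the parsed JSON from
    GET /batch-processor/{jobId}. The "status" key and its terminal
    values are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = get_job(job_id)
        if job.get("status") in ("finished", "error"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout} seconds")
        sleep(interval)


# Stand-in for a real HTTP call, so the sketch runs offline.
_responses = iter([{"status": "processing"}, {"status": "finished"}])
result = wait_for_job(lambda job_id: next(_responses), "job-123", sleep=lambda _: None)
print(result["status"])  # prints "finished"
```

Where webhooks are available, the batch-processor.job-finished event avoids polling altogether.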
Events via webhooks
Subscribe to transcription lifecycle events from the Daily dashboard or via the webhooks REST API. Events are delivered as POST requests to your webhook URL.
Real-time transcription events
transcript.started
Fired when real-time transcription begins. Includes the instanceId and, if storage is enabled, the S3 path where the transcript will be written.
See the transcript.started reference.
transcript.ready-to-download
Fired when transcription ends and the saved transcript file is available. This is the completion signal — use out_params.s3 to locate the file, or use the get transcript link endpoint to get a signed URL.
Despite the name, transcript.ready-to-download is the equivalent of “transcription stopped”: it fires when the transcript reaches a finished state. You may also receive a transcript.error event if a storage error occurred alongside completion.
See the transcript.ready-to-download reference.
transcript.error
Fired if an error occurs during transcription or before it could start. Includes an error field with details. May fire alongside transcript.started or transcript.ready-to-download depending on when the error occurred.
See the transcript.error reference.
Batch Processor events
batch-processor.job-finished
Fired when a Batch Processor job completes. The output field contains S3 locations for all generated files (transcript in all formats, and summary if requested).
See the batch-processor.job-finished reference.
batch-processor.error
Fired when a Batch Processor job fails. Includes an error field with details.
See the batch-processor.error reference.
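A webhook receiver typically routes on the event name. The sketch below covers the five events described above and assumes the POST body parses to an envelope of the form `{"type": ..., "payload": {...}}`; that envelope shape is an assumption, while the `instanceId` and `error` fields are the ones this section says the events include.

```python
def handle_event(event):
    """Return a short description of the action to take for a webhook event.

    Assumes an envelope with "type" and "payload" keys (not confirmed
    by this guide).
    """
    event_type = event.get("type")
    payload = event.get("payload", {})
    if event_type == "transcript.started":
        return f"transcription started (instance {payload.get('instanceId')})"
    if event_type == "transcript.ready-to-download":
        return "transcript finished; fetch the file"
    if event_type == "batch-processor.job-finished":
        return "batch job finished; read the output locations"
    if event_type in ("transcript.error", "batch-processor.error"):
        return f"error: {payload.get('error')}"
    return "ignored"


# Hypothetical sample events, for illustration only.
print(handle_event({"type": "transcript.started", "payload": {"instanceId": "primary"}}))
print(handle_event({"type": "batch-processor.error", "payload": {"error": "bad input"}}))
```

Returning `"ignored"` for unknown types keeps the receiver tolerant of events it has not been written to handle.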
Accessing saved transcripts
Both real-time transcripts (when enable_transcription_storage is on) and Batch Processor transcripts are accessible via the /transcript endpoints. Use these to list, download, and clean up transcripts after the fact.
Listing transcripts
Each transcript in the list includes its transcriptId, room and session identifiers, duration, and status.
Getting a download link
Deleting a transcript
| Endpoint | Description |
|---|---|
| List transcripts | GET /transcript |
| Get transcript | GET /transcript/{transcriptId} |
| Get transcript link | GET /transcript/{transcriptId}/access-link |
| Delete transcript | DELETE /transcript/{transcriptId} |
The transcript object
A transcript object represents a single transcription session:
- "t_finished" means transcription is complete and the file is available
- "isVttAvailable": true means the WebVTT file can be downloaded
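The two conditions above can be combined into a single readiness check before requesting a download link. In this sketch the `status` key holding `"t_finished"` is an assumption (the page shows the value but the field name did not survive), and the sample objects are made up; `transcriptId`, `isVttAvailable`, and the access-link path come from this guide.

```python
def access_link_path(transcript):
    """Return the access-link path for a downloadable transcript, else None.

    Assumes the finished state is reported under a "status" key.
    """
    finished = transcript.get("status") == "t_finished"
    vtt_ready = transcript.get("isVttAvailable") is True
    if finished and vtt_ready:
        return f"/transcript/{transcript['transcriptId']}/access-link"
    return None


# Hypothetical transcript objects, for illustration only.
ready = {"transcriptId": "abc123", "status": "t_finished", "isVttAvailable": True}
pending = {"transcriptId": "def456", "status": "t_finished", "isVttAvailable": False}

print(access_link_path(ready))    # prints "/transcript/abc123/access-link"
print(access_link_path(pending))  # prints "None"
```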
Related
Transcription overview
Billing, storage options, permissions, and a comparison of real-time vs. post-call approaches.
daily-js guide
Start and manage transcription from a call object with full parameter reference and a live captions example.
Daily React guide
Use the useTranscription hook for reactive transcription state in React apps.