Real-time transcription
Pay-as-you-go

Daily's real-time transcription engine generates a live transcript (closed captions) during a meeting session. The service can also store the transcript in an S3 bucket in WebVTT format.

This guide will cover:

Starting transcription and listening to transcription events
Enabling domains & rooms for transcription storage (WebVTT)
Enabling custom buckets to store transcriptions

Start transcription

There are three ways to start transcription:

During a Daily call, use the daily-js startTranscription() method.
Start transcription through our REST API.
Create a meeting token for a meeting owner that contains the auto_start_transcription property. The transcription configuration can be set with auto_transcription_settings.
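
As a rough sketch of the REST and meeting-token options, the helpers below only build the arguments for a fetch() call and do not send anything. The endpoint paths (/rooms/:name/transcription/start and /meeting-tokens), the base URL, and the helper names are assumptions for illustration and should be checked against the REST API reference.

```javascript
// Assumed base URL of Daily's REST API.
const DAILY_API = "https://api.daily.co/v1";

// Build fetch() arguments for an authenticated JSON POST (hypothetical helper).
function dailyPost(path, body, apiKey) {
  return {
    url: `${DAILY_API}${path}`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// REST option: start transcription in a room.
// Assumption: the endpoint is POST /rooms/:name/transcription/start.
function startTranscriptionRequest(roomName, apiKey, settings = {}) {
  const path = `/rooms/${encodeURIComponent(roomName)}/transcription/start`;
  return dailyPost(path, settings, apiKey);
}

// Token option: an owner token carrying auto_start_transcription.
// Assumption: tokens are created with POST /meeting-tokens.
function autoStartTokenRequest(roomName, apiKey) {
  const properties = {
    room_name: roomName,
    is_owner: true,
    auto_start_transcription: true,
  };
  return dailyPost("/meeting-tokens", { properties }, apiKey);
}
```

Either result can be sent from a server with fetch(request.url, request.init). The API key should stay on the server and never ship to the browser.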

The full set of transcription parameters is listed in the reference documentation.

Daily JS Example

Transcription can be started by calling the startTranscription() method. The "transcription-started" event is triggered once the server starts the transcription process.

The transcriptId is returned in the transcription-started event, which can be used with /transcript REST APIs.

Example API return from the `transcription-started` event

Generated transcriptions are broadcast to the call via the "transcription-message" event.
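
The flow described above could be wired up roughly as follows. This is a minimal sketch: `call` is assumed to be a daily-js call object that has already joined a room (for example, one created with Daily.createCallObject()), and startAndTrackTranscription and onTranscriptId are hypothetical names.

```javascript
// `call` is assumed to be a joined daily-js call object.
// `onTranscriptId` is a hypothetical callback supplied by your app.
function startAndTrackTranscription(call, onTranscriptId) {
  // Fires once the server has started the transcription process.
  call.on("transcription-started", (event) => {
    // transcriptId can later be used with the /transcript REST APIs.
    onTranscriptId(event.transcriptId);
  });

  // Fires for each generated transcription snippet.
  call.on("transcription-message", (event) => {
    console.log("transcription-message:", event);
  });

  // Register the listeners first so no early event is missed.
  call.startTranscription();
}
```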

Setting transcription permissions

To start transcribing a meeting with startTranscription(), the user must have the canAdmin: 'transcription' permission. Meeting owners always have this permission, along with all other admin privileges. Non-owners can become transcription admins in two ways:

The user joins with a meeting token, where the permissions property is configured to include canAdmin: 'transcription'.
A meeting owner grants the transcription permission by calling updateParticipant() and setting the canAdmin: 'transcription' permission.
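
Both ways of granting the permission might look like the sketch below. The exact shapes are assumptions to verify against the reference: an array for canAdmin in the token's permissions property, and a Set under updatePermissions for updateParticipant(). The function names are hypothetical.

```javascript
// Way 1: properties for a meeting token that grants transcription admin
// rights. The object would be sent as `properties` when creating the token.
function transcriptionAdminTokenProperties(roomName) {
  return {
    room_name: roomName,
    permissions: { canAdmin: ["transcription"] },
  };
}

// Way 2: a meeting owner grants the permission during the call.
// `call` is assumed to be the owner's daily-js call object and `sessionId`
// the session ID of the participant receiving the permission.
function grantTranscriptionAdmin(call, sessionId) {
  call.updateParticipant(sessionId, {
    updatePermissions: { canAdmin: new Set(["transcription"]) },
  });
}
```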

For example, if you want to add closed captions to your meeting and you want any user to initiate the captioning, you can either:

Configure the room to auto start the transcription. In this case, transcription will always be occurring.
Give all users the transcription admin privilege via a meeting token, so any user can initiate the transcription. In this case, you can call startTranscription() so that transcription happens on demand.

Selecting participants to transcribe

By default, startTranscription() transcribes all participants in the call. To transcribe participants selectively, pass an array of participant IDs as the participants parameter of startTranscription().

Note:

When the participants array is either not passed or is empty, startTranscription() will transcribe all participants.
If participants are passed to startTranscription(), new participants are not automatically transcribed. New participants can be added to the transcription using updateTranscription().
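
One way to handle the second note is to add qualifying participants as they join. The sketch below assumes `call` is a joined daily-js call object, that participant IDs are session IDs, and that initialIds is non-empty. transcribeSelected and shouldTranscribe are hypothetical names.

```javascript
// `shouldTranscribe` is a hypothetical app-level predicate that decides
// whether a newly joined participant should be transcribed.
function transcribeSelected(call, initialIds, shouldTranscribe) {
  const selected = new Set(initialIds);
  call.startTranscription({ participants: [...selected] });

  // Participants who join later are not transcribed automatically,
  // so add them explicitly with updateTranscription().
  call.on("participant-joined", (event) => {
    const id = event.participant.session_id;
    if (selected.has(id) || !shouldTranscribe(event.participant)) return;
    selected.add(id);
    call.updateTranscription({ participants: [...selected] });
  });
}
```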

Multi-instance transcription

For use cases where participants are sub-divided into groups within the same Daily room, multiple transcription instances can run to provide transcription messages for the specified individuals. To run multiple transcription instances, provide an instanceId as a parameter when calling the startTranscription() method. Daily uses a1f2f6b7-b1ac-4202-85e5-d446cb6c3d3f as the default instanceId if none is provided.

Note:

Each transcription instance is billed separately.
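
A grouped setup could be started along these lines. The sketch assumes `call` is a joined daily-js call object and that each instanceId is a UUID you generate yourself. startGroupTranscriptions and the `groups` shape are hypothetical.

```javascript
// Start one transcription instance per group.
// `groups` maps a caller-chosen instanceId to that group's participant IDs.
function startGroupTranscriptions(call, groups) {
  for (const [instanceId, participants] of Object.entries(groups)) {
    // Each call starts a separate, separately billed instance.
    call.startTranscription({ instanceId, participants });
  }
}
```

For example, passing two UUID keys with two participant lists starts two instances, one for each group.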

Update transcription

You can change which participants are being transcribed while transcription is ongoing. This supports use cases that require selectively enabling or disabling transcription for particular participants or groups of participants.

updateTranscription() is used to update the list of participants being transcribed.

If updateTranscription() is called with participants set to null, the transcription mode switches back to the default, which transcribes all participants in the room.
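
The two update modes can be sketched as a pair of small wrappers. `call` is assumed to be a joined daily-js call object with transcription already running. The wrapper names are hypothetical.

```javascript
// Replace the list of participants being transcribed.
function setTranscribedParticipants(call, participantIds) {
  call.updateTranscription({ participants: participantIds });
}

// Passing null switches back to the default: transcribe everyone.
function transcribeEveryone(call) {
  call.updateTranscription({ participants: null });
}
```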

Handling transcription messages

Transcription messages are emitted through an event called transcription-message. These events are emitted for all participants when a new transcription snippet is available.

Learn more in the transcription-message event reference.
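
A caption handler might look like the following sketch. The event fields participantId and text are assumptions not confirmed by this guide, so check them against the event reference. formatCaption, listenForCaptions, and nameFor are hypothetical names.

```javascript
// Turn one transcription-message event into a caption line.
// Assumed event fields: `participantId` and `text`.
// `nameFor` is a hypothetical lookup from participant ID to display name.
function formatCaption(event, nameFor = (id) => id) {
  return `${nameFor(event.participantId)}: ${event.text}`;
}

// `call` is assumed to be a joined daily-js call object.
function listenForCaptions(call, onCaption, nameFor) {
  call.on("transcription-message", (event) => {
    onCaption(formatCaption(event, nameFor));
  });
}
```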

Enabling domains & rooms for transcription storage

This is an optional step and only required if you want to save the live transcript from your session into a file.

enable_transcription_storage defaults to false, which means transcripts are generated and broadcast but not saved.

To save the transcript, set the enable_transcription_storage property to true at the room level or at the domain level.

Transcription output

The output file will be written in the WebVTT format. This is widely used for A/V text tracks such as subtitles and captions, so there are many tools that work with this format and can transform it into other text formats.

While transcription is active, the output file will be written to storage every two minutes. When transcription ends, the final version of the file will be written. You can query the REST API to get a transcription's status:

If the transcription is finished, the transcript object will contain status: "t_finished"
If the WebVTT file is available, the transcript object will contain isVttAvailable: true
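
The two checks above can be combined into a small status helper. In this sketch, the GET /transcript/:id endpoint and the base URL are assumptions. describeTranscript and fetchTranscript are hypothetical names, and fetchTranscript relies on a global fetch() (Node 18+ or a browser).

```javascript
// Summarize a transcript object using the two fields described above.
function describeTranscript(transcript) {
  const finished = transcript.status === "t_finished";
  const vttReady = transcript.isVttAvailable === true;
  if (finished && vttReady) return "finished, WebVTT available";
  if (finished) return "finished, no WebVTT file";
  return vttReady ? "in progress, partial WebVTT available" : "in progress";
}

// Assumption: GET /transcript/:id returns the transcript object as JSON.
async function fetchTranscript(transcriptId, apiKey) {
  const url = `https://api.daily.co/v1/transcript/${encodeURIComponent(transcriptId)}`;
  const response = await fetch(url, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);
  return response.json();
}
```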

Enabling for a specific room
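
A room-level request could be described as in the sketch below. It assumes room properties are updated with POST /rooms/:name, relative to the REST API base URL. enableRoomStorageRequest is a hypothetical helper that only builds the request description.

```javascript
// Describe the request that enables transcript storage for one room.
// Assumption: room properties are updated with POST /rooms/:name.
function enableRoomStorageRequest(roomName) {
  return {
    method: "POST",
    path: `/rooms/${encodeURIComponent(roomName)}`,
    body: { properties: { enable_transcription_storage: true } },
  };
}
```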

Enabling at the domain level
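
At the domain level, the same property would go into the domain configuration. The sketch assumes that configuration is updated with a POST to the root path of the REST API. enableDomainStorageRequest is a hypothetical helper that only builds the request description.

```javascript
// Describe the request that enables transcript storage for the whole domain.
// Assumption: domain configuration is updated with POST / on the REST API.
function enableDomainStorageRequest() {
  return {
    method: "POST",
    path: "/",
    body: { properties: { enable_transcription_storage: true } },
  };
}
```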

Enabling custom buckets to store transcriptions

By default, transcripts are stored in Daily's cloud storage. You can change where your transcripts are stored & managed through the transcription_bucket domain property.

This is similar to setting up your own custom S3 storage for recordings. See the custom S3 storage guide for the full setup.

Enabling a custom bucket for transcription
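
The transcription_bucket domain property might be set as in the sketch below. The nested field names (bucket_name, bucket_region, assume_role_arn) are assumptions modeled on the custom S3 setup for recordings, as is the POST to the root path, so verify them before use. transcriptionBucketRequest is a hypothetical helper.

```javascript
// Describe the request that points transcript storage at a custom S3 bucket.
// Assumptions: the nested field names below and the POST / endpoint.
function transcriptionBucketRequest({ bucketName, bucketRegion, assumeRoleArn }) {
  return {
    method: "POST",
    path: "/",
    body: {
      properties: {
        transcription_bucket: {
          bucket_name: bucketName,
          bucket_region: bucketRegion,
          assume_role_arn: assumeRoleArn,
        },
      },
    },
  };
}
```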
