Scaling applications to support large calls
Daily defines large meetings as calls with fifty or more participants. The user interface and experience on each call varies, but they all need to route hundreds to thousands of participant media tracks at a time. This guide covers the Daily recommended settings for few-to-many broadcast sessions and interactive large meetings, and shares best practices to scale large calls.
Keynotes, panels, and webinars are examples of few-to-many broadcast sessions. A speaker, or several presenters, addresses many attendees who are only viewing and listening, not sharing video or audio, on the call.
These kinds of sessions can be built on Daily in two ways: through Daily live streaming, or on a Daily call with up to 1,000 participants.
With either of these implementations, Daily APIs support adding features like:
- A "green room" for speakers to test media devices and talk to event production staff
- Interfaces for event production staff to control the visual layout, visible speakers, and external media playback
- Interactive features such as participant hand raising, microphone passing so that a viewer can briefly participate in the broadcast (e.g. ask a question), text chat Q&A, live polling, and transcription
- Recording options, including for individual video and audio tracks for high quality post-production
Daily sessions can be broadcast to multiple live streaming platforms, like AWS IVS or YouTube Live. The layout of the stream can be configured and include up to nine video and 50 audio tracks. Live streaming could be the best option for a large call if hundreds of thousands of people or more could be tuning in, or if providing a viewing link to attendees is preferred to managing silent participants on a call. For more details on Daily live streaming and how to implement it, head to the guide.
Daily supports up to 1,000 participants in a call, with every client receiving real-time (100–200 ms of latency) video and audio. At most six participants can "broadcast" (have their camera and microphone on), in addition to one screen share; all other participants must have cameras and microphones off. This applies whether you embed Daily Prebuilt, the ready-to-use video chat interface, or build a completely custom application with the Daily call object.
To set up a few-to-many broadcast session with Daily Prebuilt, set the `owner_only_broadcast` room property to `true`. This specifies that only a participant who joins the call with a meeting token that identifies them as an owner will be able to share audio or video.
To control who starts a call as a presenter and who starts as a viewer, use the `start_audio_off` and `start_video_off` meeting token properties. `false`, the default, identifies presenters; setting these properties to `true` designates viewers.
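Assuming a server-side `DAILY_API_KEY` environment variable, the room and token payloads for this setup might look like the following sketch (the room name is a placeholder, and error handling is omitted):

```javascript
// Sketch: payloads for a broadcast room and its meeting tokens.
const roomPayload = {
  name: "my-broadcast", // placeholder room name
  properties: { owner_only_broadcast: true }, // only owners may share audio/video
};

const presenterTokenPayload = {
  // is_owner marks this participant as an owner, i.e. a presenter
  properties: { room_name: "my-broadcast", is_owner: true },
};

const viewerTokenPayload = {
  // viewers join with mic and camera off
  properties: {
    room_name: "my-broadcast",
    start_audio_off: true,
    start_video_off: true,
  },
};

// POST a payload to a Daily REST endpoint,
// e.g. createResource("rooms", roomPayload)
// or createResource("meeting-tokens", viewerTokenPayload)
async function createResource(path, payload) {
  const res = await fetch(`https://api.daily.co/v1/${path}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.DAILY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  return res.json();
}
```

Tokens minted this way are handed to clients before they `join()`, so each client starts the call in the intended presenter or viewer state.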
Daily classifies calls with up to fifty participants who all need to engage with each other, keeping cameras and microphones on, as interactive large sessions. Conference breakout rooms or workshops, corporate trainings, and other virtual events all fall under this category.
These kinds of sessions are seeing a lot of evolution and experimentation, especially in social and spatial video. Examples include two-dimensional maps with video avatars for each participant, three-dimensional WebGL worlds, and virtual event spaces with topic-specific social "tables" that participants can hop between.
Platforms to support interactive large sessions can be built using either Daily Prebuilt or the Daily call object. Whether embedding the ready-to-use interface or creating a custom implementation, core Daily features like recording, output to live streaming platforms, and transcription are all available.
In interactive large sessions that use Daily Prebuilt, each participant can view either an Active Speaker layout, with the person currently speaking taking up the majority of the screen, or Grid View, a paginated grid layout with a subset of participants visible on screen.
Daily Prebuilt includes built-in features like text chat, a participant list, recording and live streaming.
These calls can support up to 50 participants with all cameras on, with each participant viewing at most six other participants at a time. The next section of the guide covers how to implement those limits with track subscriptions and pagination.
As the number of call participants increases, the number of audio, video, and screen media tracks that need to be routed and then handled by each web client multiplies. This can drain CPU, strain networks to their limits (risking undelivered media streams), and degrade user experience quickly, especially on older or mobile devices.
To optimize performance on calls with hundreds of participants, pagination, track subscriptions, and simulcast layer control can all be implemented using Daily’s APIs. Analyzing call logs can also help improve call quality.
Pagination limits the number of participants displayed on a screen at a time, so a call participant has to click to rotate through all other attendees. This reduces the load on any individual participant’s CPU and bandwidth.
The demo shown is built on React, but pagination can be implemented regardless of framework, with some assembly and state management required.
Use the Daily `participants()` method, which returns an object detailing the current meeting participants, and the following pseudocode to add pagination:
1. Establish constant UI elements like the minimum participant video tile width, default aspect ratio, and the maximum number of tiles per page.
2. Use the UI constants and the number of call participants to calculate the total number of pages. The number of pages, and which participants are on each page, should update as the number of call participants changes, so set up the state management of your choice.
3. Add click handlers to the pagination buttons that update the app state to reflect the current page being viewed.
4. Set up a handler to determine the visible participants on a given page, using the current page, number of participants, and maximum number of tiles per page to make a copy of the participants object that only includes visible participants.
5. Iterate over the visible participants object to render each participant in the UI.
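As a concrete, framework-agnostic sketch of the page-count and visible-participants steps, assuming `participants` is the object returned by `participants()` and a hypothetical tiles-per-page constant:

```javascript
const MAX_TILES_PER_PAGE = 9; // assumed UI constant

// Total number of pages for the current participant count
function pageCount(participants, tilesPerPage = MAX_TILES_PER_PAGE) {
  return Math.max(1, Math.ceil(Object.keys(participants).length / tilesPerPage));
}

// A copy of the participants object containing only the participants
// visible on `currentPage` (1-indexed)
function visibleParticipants(participants, currentPage, tilesPerPage = MAX_TILES_PER_PAGE) {
  const start = (currentPage - 1) * tilesPerPage;
  const visibleIds = Object.keys(participants).slice(start, start + tilesPerPage);
  return Object.fromEntries(visibleIds.map((id) => [id, participants[id]]));
}
```

State management (re-running these helpers on `"participant-joined"` and `"participant-left"` events, and on pagination clicks) is left to the framework of your choice.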
Daily calls operate on a publish-subscribe model: participants publish audio, video, and screen MediaStreamTracks, and are subscribed to other participants' tracks.
By default, Daily routes a participant’s distinct set of tracks to all the other participants on a call. In large calls with hundreds or thousands of participants, decoding all that video can demand a lot of network bandwidth and processing power.
Turning off Daily’s default track handling in favor of manual track subscriptions can deliver better call quality during sessions with many participants. Track subscriptions can also enable features like breakout groups, and improve features like pagination.
There are three steps to set up direct track subscriptions:
- Set the `subscribeToTracksAutomatically` call object property to `false`. This turns off Daily’s default track management. This property can be passed on `createCallObject()`, or set via the `setSubscribeToTracksAutomatically()` method. The latter is a good option if you want to wait to turn on manual track subscriptions until a certain number of participants have joined the call.
- Use the `updateParticipant()` or `updateParticipants()` methods to change the `subscribed` value of a participant's tracks.
Each individual participant’s `tracks` property can be found on the Daily participants object. The `tracks` for a participant include `audio` (microphone), `video` (camera), and `screenVideo` (screen share) properties. Each track type contains a raw media stream track object, the `state` of the track, and its `subscribed` status.
The `state` indicates if the track can be played (the full set of possible states is listed in the `participants()` documentation).
`subscribed` tells us whether or not the local participant is receiving the track information from the participant who the track belongs to. Its value is `true` if the local participant is receiving the track, `false` if they are not, or `"staged"`.
A `subscribed` status of `"staged"` keeps the connection for that track open, but stops any bytes from flowing across. Compared to a complete connection teardown through an unsubscribe, staging a track speeds up the process of showing or hiding participants' video and audio. Staging also cuts out the processing and bandwidth required for that track.
On calls with many participants, setting the `subscribed` value to `false` or `"staged"`, depending on the interface, can minimize the load on participants’ connections, improving the user experience. Paginated grid apps (including Daily Prebuilt!) often subscribe to the tracks of participants on the current page, stage those on the previous and next pages, and unsubscribe from the rest.
`updateParticipant()` takes a participant’s session id along with a properties object whose `setSubscribedTracks` value indicates the new `subscribed` status of each track type. To update multiple participants at once with `updateParticipants()`, combine all participants’ updates into one object keyed by session id.
- Listen for participant events to reflect updated participant state in the app interface.
The `updateParticipant()` method fires a corresponding `"participant-updated"` event. Add a listener for this event to update the app interface as participants’ `subscribed` statuses change.
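Putting the three steps together for a paginated grid might look like the following sketch. `buildTrackUpdates` and its id arrays are hypothetical helpers, and the daily-js calls appear in comments so the snippet stays self-contained:

```javascript
// Step 1 (at setup time): turn off default track management:
//   const callObject = DailyIframe.createCallObject({
//     subscribeToTracksAutomatically: false,
//   });

// Step 2: build the argument for callObject.updateParticipants().
// Subscribe to video for participants on the current page, stage the
// previous/next pages, and unsubscribe from everyone else.
function buildTrackUpdates(allIds, visibleIds, stagedIds) {
  const updates = {};
  for (const id of allIds) {
    let video = false;
    if (visibleIds.includes(id)) video = true;
    else if (stagedIds.includes(id)) video = "staged";
    // keep audio subscribed so everyone stays audible regardless of page
    updates[id] = { setSubscribedTracks: { audio: true, video } };
  }
  return updates;
}

// Step 2, applied:
//   callObject.updateParticipants(buildTrackUpdates(allIds, visibleIds, stagedIds));

// Step 3 (at setup time): re-render tiles as subscriptions change:
//   callObject.on("participant-updated", (ev) => renderTile(ev.participant));
```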
To take advantage of simulcast layers, the call must be happening over an SFU connection. To force an SFU connection in development, use the `setNetworkTopology()` method; this is only for testing, since in production the switch happens automatically.
Once a fifth participant joins a Daily call — or if recording, transcription, or a live stream is started — instead of direct peer-to-peer (P2P) connections, tracks are first sent to a Selective Forwarding Unit (SFU). From there, the SFU processes, re-encrypts, and routes media tracks to participants, allowing tracks to be "selectively" forwarded.
With WebRTC simulcast, instead of sending a single track to the SFU, a publishing participant’s web client sends the same source track at a few different resolutions, bitrates, and frame rates. Each of these groups of settings is known as a simulcast layer.
By default, the Daily client and SFU will work to send the highest layer possible.
There are several factors that affect which layer a participant receives. Those factors include:
- The available bandwidth on the receiving end: The SFU will send a lower layer if the current one exceeds the available bandwidth.
- The send side is not sending all the layers: this typically occurs when the browser detects network or CPU issues, or when the device's highest video resolution cannot support the highest layers. It's also worth noting that the browser can and will modify the actual bitrate, frame rate, and resolution sent on each layer for these same reasons, so the settings actually used may not match the configuration.
| | Layer 0 | Layer 1 | Layer 2 |
| --- | --- | --- | --- |
| Frame rate (fps) | 10 | 15 | 30 |
| Resolution (width x height) | 320x180 | 640x360 | 1280x720 |
| | Layer 0 | Layer 1 |
| --- | --- | --- |
| Frame rate (fps) | 10 | 30 |
| Resolution (width x height) | 320x180 | 1280x720 |
While the default Daily simulcast management suits most use cases, it is possible to adjust the quality of both the video a participant publishes to the server (send-side) and the video that participants receive (receive-side). This can be useful in large calls as a tool to optimize bandwidth, but proceed with caution. Attempting to make changes while calls are in progress or to use values beyond browsers' capacities can cause problems. Please reach out if we can help.
Adjusting the quality of the video that participants send to the server can be useful in large calls like few-to-many style broadcasts. For example, the main active speaker showcased prominently in the UI could send higher quality video than the other call participants.
There are three properties that can be manipulated, per simulcast layer, to adjust send-side video quality:
- `maxBitrate`: the desired video bitrate for the layer, in bits per second.
- `maxFramerate`: an integer value lower than the maximum rate expected from the camera, in frames per second.
- `scaleResolutionDownBy`: an integer power of 2. The horizontal and vertical dimensions of the source video will be divided by this number to get the resolution for the layer.
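As one possible sketch, these three properties could be set per layer through the `camSimulcastEncodings` option in `dailyConfig`; the bitrate values below are illustrative only, not recommendations, and the option should be checked against the daily-js version in use:

```javascript
// Illustrative simulcast layer settings, lowest to highest quality.
// The scaleResolutionDownBy values derive 320x180, 640x360, and 1280x720
// from a 1280x720 source.
const camSimulcastEncodings = [
  { maxBitrate: 80_000, maxFramerate: 10, scaleResolutionDownBy: 4 },  // layer 0
  { maxBitrate: 200_000, maxFramerate: 15, scaleResolutionDownBy: 2 }, // layer 1
  { maxBitrate: 680_000, maxFramerate: 30, scaleResolutionDownBy: 1 }, // layer 2
];

// Passed when creating the call object:
//   DailyIframe.createCallObject({ dailyConfig: { camSimulcastEncodings } });
```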
The `trackConstraints` property, set via the `setBandwidth()` method, determines the 'input quality' of the video that a browser gets from a webcam by defining a maximum resolution (`width` and `height` values) and a maximum `frameRate`. The `setBandwidth()` call needs to happen before a video track is requested from the participant. Disable video on `join()` for participants whose bandwidth constraints need to be set manually, then call `setBandwidth()`, then enable the camera.
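The join-then-constrain sequence might be sketched like this, assuming a call object created with video off; the values are illustrative:

```javascript
// Illustrative input-quality cap: 640x360 at 15 fps, ~200 kbps.
const bandwidthSettings = {
  kbs: 200, // target send bitrate in kilobits per second
  trackConstraints: { width: 640, height: 360, frameRate: 15 },
};

// Join with video off, apply the cap, then enable the camera, so the
// constraints are in place before the video track is first requested.
async function joinWithCappedVideo(callObject, url) {
  await callObject.join({ url }); // assumes the call object was created with video off
  callObject.setBandwidth(bandwidthSettings);
  callObject.setLocalVideo(true); // now request the (constrained) camera track
}
```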
If many participant videos will be displayed or if they will be rendered on small mobile screens, for example, programmatically requesting a lower bitrate layer can improve call performance. Lower quality videos not only save downstream bandwidth, but also can dramatically improve CPU performance since they require fewer resources to render.
To use Daily `receiveSettings` to request a specific simulcast layer:
- Pass the desired layer for a participant's video in `receiveSettings`, either when joining or later via the `updateReceiveSettings()` method.
- Listen for the `"receive-settings-updated"` event to update the app interface in response.
See the logs and metrics guide for details on analyzing call quality.