Introduction to the Daily Client SDKs for Android and iOS
The Daily Client SDKs for Android and iOS allow you to build video and audio calling into your native mobile applications. This guide will help you get started by introducing you to the basics of the libraries, showing you some examples, and pointing you to where you can learn more.
In the following guide, some features are marked as Coming soon. They will be in an upcoming release.
Read our Android and iOS documentation for more information on implementation and navigating our reference docs.
For reference docs for the Daily Client SDKs, visit our:
See the installation guides for:
To see a working example of how to interact with the Daily Client SDKs for Android and iOS, see our public demo apps:
Let's start with a simple example. The following code snippet shows how you might join a video call, display participants' camera tracks, and toggle your own camera.
To keep this code snippet concise, it only handles video, it ignores cleaning up participants who leave, and it glosses over adding views/elements to your app.
Instead of calling release() yourself as above, you can provide a lifecycle parameter to the CallClient constructor. For example, within an Activity you might invoke
This will release the call client as soon as the activity is destroyed, which would stop your app from running a call in the background. Most call applications probably want to keep running when a user switches apps, so we advise calling
Navigating the API
The Daily Client SDKs for Android and iOS group their functionality by the tasks involved in building a successful video call experience:
- Setting up the call client
- Managing the call lifecycle
- Handling participants
- Handling the active speaker
- Managing media
- Managing subscriptions
We'll review each below with some code examples.
Setting up the call client
CallClient is your main interface into your Daily video call. It's an object that you instantiate and hold onto at least for the duration of your call.
Tip: You can also hold onto the CallClient longer and use it for multiple sequential calls.
To write code that dynamically reacts to changes in the call, such as participants joining, leaving, muting, or unmuting, you'll need to register to receive events from the CallClient. Note that you can register to receive events at any point in the call lifecycle, and that your event listeners will remain on the CallClient even after you leave() a call.
On iOS, you would make a class (such as your UIViewController subclass) conform to the CallClientDelegate protocol and assign it to the call's
On Android, you add listeners to the call:
Managing the call lifecycle
Managing the call lifecycle mostly comes down to deciding when to join() and when to leave() a call, and updating your UI according to the current
When you successfully join() a call, other participants in the same room become visible to you and you become visible to them. “Becoming visible” in this case means gaining an entry in each other's participants() object, with some basic information filled out. It does not necessarily mean that their media is immediately available to you. That depends on whether or not they're publishing any tracks and whether or not you're subscribed to any tracks.
To show the right UI in your app for the right callState, you'll probably want to listen for changes in
Note that the pool of participants in a call is always in flux, so rather than looking up participants() once (as in the above example), build your app so that it reacts as participants join, leave, and change.
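As a rough illustration of reacting to a changing participant pool, here is a minimal Kotlin sketch. It does not use the Daily SDK: ParticipantInfo, ParticipantRegistry, and their fields are hypothetical stand-ins, and in a real app you would feed the registry from whatever participant events the SDK delivers.

```kotlin
// Hypothetical stand-in for an SDK participant; the field names are assumptions.
data class ParticipantInfo(val id: String, val isLocal: Boolean, val cameraOn: Boolean)

// Keeps the latest known snapshot of every participant, keyed by id.
class ParticipantRegistry {
    // LinkedHashMap preserves the order in which participants first appeared.
    private val byId = LinkedHashMap<String, ParticipantInfo>()

    // "Joined" and "updated" are handled the same way: store the newest snapshot.
    fun onJoinedOrUpdated(participant: ParticipantInfo) {
        byId[participant.id] = participant
    }

    // Forget a participant once they leave the call.
    fun onLeft(id: String) {
        byId.remove(id)
    }

    // Ids of remote participants whose camera is currently on, in join order.
    fun visibleRemoteIds(): List<String> =
        byId.values.filter { !it.isLocal && it.cameraOn }.map { it.id }
}
```

Because every event simply replaces the stored snapshot, the UI can be re-rendered from visibleRemoteIds() after each change without tracking what specifically changed.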
Which brings us to...
Handling participants
Throughout a call, participants may come and go. They may do things like mute and unmute their mic or their camera. Additionally, the remote participants may start or stop screen sharing.
To handle all these changes, you'll want to build your app so that it reacts appropriately no matter what changed.
Arguably the most important participant fields, other than id, are the media tracks themselves. Each media track has a state, which you'll likely use alongside the track itself.
We have the following media tracks:
- camera: The participant's camera track and track state.
- microphone: The participant's microphone track and track state.
- screenAudio: The participant's screen audio track and track state.
- screenVideo: The participant's screen video track and track state.
Note that the local user is represented as a participant too! They get a special entry in participants() with a special isLocal field set to true. The local participant will have media populated by any enabled inputs.
Also note that when you first join() a call, you'll receive "participant-joined" events for all the remote participants, even if they were in the call before you. This can help you build your UI without needing to write special code paths for the initial join.
Handling the active speaker
If you are building an app with multiple participants, one important piece of information is the active speaker in the call.
In addition to knowing who the current active speaker is, it's also important to know when the active speaker has changed (i.e. when someone new has started to speak in the call). Let's look at how to update your app to track and react to the active speaker.
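The following Kotlin sketch shows one way to track the active speaker and react only when it actually changes. It is independent of the Daily SDK: ActiveSpeakerTracker and its report() method are hypothetical names, and the assumption is that your SDK event handler calls report() with the id of whoever is currently speaking.

```kotlin
// Remembers the current active speaker and notifies a callback only on real changes.
class ActiveSpeakerTracker(
    private val onChanged: (previous: String?, current: String?) -> Unit
) {
    // Id of the participant currently considered the active speaker, if any.
    var currentId: String? = null
        private set

    // Call this whenever the SDK reports an active speaker (hypothetical hook).
    fun report(speakerId: String?) {
        if (speakerId == currentId) return // Same speaker as before: nothing to do.
        val previous = currentId
        currentId = speakerId
        onChanged(previous, speakerId)
    }
}
```

Passing both the previous and current ids to the callback lets the UI remove the highlight from one tile and add it to another in a single step.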
Managing media
Managing media is one of the most important tasks in building a video call experience. It can also be one of the more complicated tasks.
There are a few collections of settings that dictate how media (i.e., camera and microphone tracks) gets from the local user to remote participants, and how remote participants' media gets to the user.
- Inputs: the sender-side settings dictating what pieces of media should be gathered from the user (e.g. a camera track) and how (e.g. at what resolution or frame rate)
- Publishing: the sender-side settings dictating what pieces of media to send to all participants (e.g. the camera track), and how (e.g. with what encoding settings)
- Subscriptions: the receiver-side settings dictating what pieces of media to receive from each participant, and how (e.g. at what quality, if multiple simulcast layers are available). For more about subscription management, see Managing subscriptions.
Bytes representing a piece of media (e.g. the camera track) only flow from a sender to a receiver if:
- The sender's inputs are configured to capture a track from the user's camera
- The sender's publishing settings are configured to send that camera track
- The receiver's subscriptions are configured to receive the sender's camera track
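The three conditions above can be read as a single boolean conjunction. The Kotlin sketch below models that rule with hypothetical types (SenderState, ReceiverState); it is a mental model only, not SDK code.

```kotlin
// Hypothetical sender-side state: is the camera captured, and is it published?
data class SenderState(val cameraInputEnabled: Boolean, val cameraPublished: Boolean)

// Hypothetical receiver-side state: is the receiver subscribed to the sender's camera?
data class ReceiverState(val subscribedToCamera: Boolean)

// Camera media flows only when inputs, publishing, and subscriptions all allow it.
fun cameraMediaFlows(sender: SenderState, receiver: ReceiverState): Boolean =
    sender.cameraInputEnabled && sender.cameraPublished && receiver.subscribedToCamera
```

If any one of the three settings is off, no camera bytes reach the receiver, regardless of the other two.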
Media API fundamentals: Low-level vs. convenience methods
You can manipulate the above media settings directly through a few low-level/advanced API methods (such as updateSubscriptions()). Alternatively, you can carry out simple tasks, like muting the microphone, using one of the media control convenience methods (such as setSubscriptionProfile()). The following sections present convenience methods side-by-side with their low-level equivalents for illustration.
Media API fundamentals: What vs. how
A pattern you'll see repeated throughout the media control APIs is the what vs. how pattern. This makes the media control APIs feel similar to each other.
Media API fundamentals: Flexibility
Daily APIs were designed with flexibility in mind, letting you update each of the collections of media settings completely independently from one another, and at any point in time.
You can also update the user's media settings while they're not in a call. If you enable inputs outside of a call, the user's devices will still turn on, and their media will be available through the local entry in
Media API fundamentals: Default values and updates
You may have noticed that in the above example we never had to specify an exhaustive set of settings for
join(). That's because for everything you don't specify, Daily falls back to using its own defaults.
Note that this allows us to keep the most common usage concise.
To peek under the hood and understand what settings Daily is using, you can always use the getters for
There's a bit of nuance worth mentioning: when you invoke an update*() method, you provide it an update, that is, a partial settings specification. This is worth emphasizing: update*() doesn't replace your specified settings; it merges the update into them. The below example shows how this works, and illustrates how we fall back to Daily defaults for anything not in our specified settings.
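To make the merge semantics concrete, here is a small self-contained Kotlin model. Everything in it is hypothetical: the CameraSettings and CameraUpdate types, the SettingsStore class, and the default values are assumptions chosen for illustration and do not reflect the SDK's real types or defaults.

```kotlin
// Fully resolved settings (hypothetical fields).
data class CameraSettings(val enabled: Boolean, val width: Int, val frameRate: Int)

// A partial update: null means "not specified".
data class CameraUpdate(
    val enabled: Boolean? = null,
    val width: Int? = null,
    val frameRate: Int? = null
)

// Assumed defaults, invented for this sketch.
val defaultCameraSettings = CameraSettings(enabled = true, width = 1280, frameRate = 30)

class SettingsStore {
    // Everything the app has explicitly specified so far.
    private var specified = CameraUpdate()

    // Merge the update into the specified settings; unspecified fields are left alone.
    fun update(update: CameraUpdate) {
        specified = CameraUpdate(
            enabled = update.enabled ?: specified.enabled,
            width = update.width ?: specified.width,
            frameRate = update.frameRate ?: specified.frameRate
        )
    }

    // Effective settings: specified values win, defaults fill the gaps.
    fun effective(): CameraSettings = CameraSettings(
        enabled = specified.enabled ?: defaultCameraSettings.enabled,
        width = specified.width ?: defaultCameraSettings.width,
        frameRate = specified.frameRate ?: defaultCameraSettings.frameRate
    )

    // Drop all specified settings and fall back to defaults entirely.
    fun reset() {
        specified = CameraUpdate()
    }
}
```

In this model, calling update(CameraUpdate(width = 640)) followed by update(CameraUpdate(enabled = false)) leaves both changes in effect, while frameRate still comes from the defaults.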
Finally, if at any point you want to stop using your specified settings and revert to using Daily's default settings, you can.
Inputs and the local participant
As described in Handling participants, there's a special local participant that represents the local user. The local participant's media comes directly from inputs, so you don't need to publish to receive your own media through Daily. You don't even need to have called join(), for that matter.
Why do it this way, rather than have your local tracks available through inputs()? With all media living in participants(), it's easier to write media-handling code that's agnostic to whether it's coming from a local or remote track.
VideoView scale modes
Use the videoScaleMode property to define how the video scales within the view:
- FILL: the video frame is scaled to fill the view while maintaining the aspect ratio. Some portion of the video frame may be clipped. This is the default value.
- FIT: the video frame is scaled to fit entirely within the view while maintaining the aspect ratio. The view may show empty borders around the video.
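The difference between the two modes comes down to which axis determines the scale factor. The Kotlin sketch below shows the usual aspect-preserving arithmetic; it is an assumption about how such modes are typically computed, not a description of the SDK's rendering code, and ScaleMode and scaleFactor are hypothetical names.

```kotlin
import kotlin.math.max
import kotlin.math.min

enum class ScaleMode { FILL, FIT }

// Uniform scale factor applied to the video frame so it fills or fits the view.
fun scaleFactor(mode: ScaleMode, frameW: Int, frameH: Int, viewW: Int, viewH: Int): Double {
    val scaleX = viewW.toDouble() / frameW
    val scaleY = viewH.toDouble() / frameH
    return when (mode) {
        // The larger factor covers the whole view; the overflowing axis is clipped.
        ScaleMode.FILL -> max(scaleX, scaleY)
        // The smaller factor keeps the whole frame visible; the other axis gets borders.
        ScaleMode.FIT -> min(scaleX, scaleY)
    }
}
```

For a 1280x720 frame in a 720x720 view, FILL scales by 1.0 (the 1280-wide frame is clipped horizontally) and FIT scales by 0.5625 (the frame becomes 720x405, leaving borders above and below).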
Managing subscriptions
Daily calls operate on a publish-subscribe model: participants publish audio, video, and screen tracks (screen not yet available to publish through the mobile SDKs), and are subscribed to other participants' tracks.
By default, Daily routes each participant’s tracks to all the other participants on a call. In large calls, receiving and decoding all that video can demand a lot of network bandwidth and processing power.
Changing Daily’s default subscription track handling in favor of your more specific use case can deliver better call quality during sessions with many participants. Track subscriptions make it possible to build features where only subsets of users are viewed at a time, like breakout groups or pages of participants in a grid view.
The base profile is the default profile used by Daily for each new participant who joins a meeting. By changing the base profile, you can change the subscription settings used by new clients who join.
For example, to create an app with audio only, you would need to update the base profile to only subscribe to audio tracks.
Additionally, you can create new profiles for common scenarios in your application. Doing so allows you to map the different subscription scenarios in a call to specific profiles. Once defined, a user's subscription settings can be changed using the newly-defined profile.
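As a conceptual model of how profiles relate to per-participant settings, consider the following Kotlin sketch. It is not SDK code: ProfileSettings, SubscriptionProfiles, and the "gridPage" profile name are all hypothetical, and the only assumption carried over from the text is that participants without an explicit profile use the base profile.

```kotlin
// Hypothetical per-profile settings: which kinds of tracks to receive.
data class ProfileSettings(val audio: Boolean, val video: Boolean)

class SubscriptionProfiles(base: ProfileSettings) {
    // Named profiles; "base" always exists.
    private val profiles = mutableMapOf("base" to base)

    // Which profile each participant has been explicitly assigned.
    private val assignment = mutableMapOf<String, String>()

    // Create or replace a named profile.
    fun define(name: String, settings: ProfileSettings) {
        profiles[name] = settings
    }

    // Switch a participant to a previously defined profile.
    fun assign(participantId: String, profile: String) {
        require(profile in profiles) { "Unknown profile: $profile" }
        assignment[participantId] = profile
    }

    // Participants without an explicit assignment fall back to the base profile.
    fun settingsFor(participantId: String): ProfileSettings =
        profiles.getValue(assignment[participantId] ?: "base")
}
```

With an audio-only base profile and a "gridPage" profile that also enables video, only participants assigned to "gridPage" would have their video received; everyone else stays audio-only.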
For example, your application may have an active speaker layout, where the active speaker is displayed more prominently than others. Since this is a common state in your application, you can define a new profile, which in this case we'll call
You can check the current state of the subscriptions and subscription profiles at any time.
Meeting tokens provide access to private rooms, and can pass some user-specific properties into the room. To join a private room, provide a valid meeting token to callClient.join(). The value passed in must not contain any whitespace or be wrapped in apostrophes or quotes.
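If the token comes from user input, it can help to normalize it before use. The helper below is a hypothetical Kotlin sketch (sanitizeToken is not part of the SDK) that removes whitespace and strips surrounding straight quotes; the result would then be used to build the token value you pass when joining.

```kotlin
// Removes all whitespace, then strips any leading/trailing straight quotes or apostrophes.
fun sanitizeToken(raw: String): String =
    raw.filterNot { it.isWhitespace() }.trim('"', '\'')
```

This sketch assumes a valid token never contains whitespace or quote characters, so removing them cannot corrupt it.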
If you have a text field declared somewhere, similar to the following:
You can access the text value, trim out unwanted characters, build a MeetingToken, and pass it into join() as follows:
If you have an EditText declared similar to the following:
You can use it to build a MeetingToken, which you can then pass to join(), as follows:
If a token provided during join specifies a username or user ID (e.g. an external ID from another application), that data appears on the relevant participant in both local and remote contexts.
Async on Android: Using
The Daily Android SDK includes a CallClientCoroutineWrapper class to provide a suspend-based version of the API. The asynchronous examples throughout this guide have supplied callbacks to methods using the pattern that follows:
Instead, you can make calls on the wrapper from within Kotlin coroutines, suspending execution until the operation is done, throwing exceptions on failure. The following is an example of the previous snippet rewritten to use the
To use this pattern, instantiate a CallClient, then pass it into the CallClientCoroutineWrapper constructor, and then use the wrapper as needed:
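For readers curious about the general technique behind a suspend-based wrapper, the Kotlin sketch below adapts a callback-style function into a suspend function using only the standard library. It does not use or reproduce CallClientCoroutineWrapper: joinWithCallback, joinSuspending, and runToCompletion are hypothetical names, and the "join" here is a stand-in that completes immediately.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback-style operation that reports success or failure.
fun joinWithCallback(url: String, onDone: (Result<String>) -> Unit) {
    if (url.startsWith("https://")) {
        onDone(Result.success("joined"))
    } else {
        onDone(Result.failure(IllegalArgumentException("invalid URL: $url")))
    }
}

// Suspend adapter: resumes with the value on success, throws on failure.
suspend fun joinSuspending(url: String): String = suspendCoroutine { continuation ->
    joinWithCallback(url) { result -> continuation.resumeWith(result) }
}

// Tiny stdlib-only runner for demonstration; it only supports blocks that
// complete synchronously. Real apps would use a coroutine scope instead.
fun <T> runToCompletion(block: suspend () -> T): Result<T> {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation<T>(EmptyCoroutineContext) { outcome = it })
    return outcome ?: error("coroutine did not complete synchronously")
}
```

Calling runToCompletion { joinSuspending("https://example.invalid/room") } yields a successful Result, while a non-HTTPS URL surfaces the failure as an exception inside the coroutine, which is the behavior described above for suspend-based calls.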