Introduction to the Daily Client SDKs for Android and iOS

The Daily Client SDKs for Android and iOS allow you to build video and audio calling into your native mobile applications. This guide will help you get started by introducing you to the basics of the libraries, showing you some examples, and pointing you to where you can learn more.

Some features in this guide are marked as Coming soon. They will ship in an upcoming release.

Documentation

Read our Android and iOS documentation for more information on implementation and navigating our reference docs.

For reference docs for the Daily Client SDKs, visit our:

Installation

See the installation guides for:

Quickstart guides

Want to dive right in? Check out our quickstart guides:

Demo apps

To see a working example of how to interact with the Daily Client SDKs for Android and iOS, see our public demo apps:

Starter kits

We also provide starter kits built by Daily's engineering team. These apps are built for production use and show how to build a production app with the mobile SDKs:

Hello, world!

Let's start with a simple example. The following code snippet shows how you might join a video call, display participants' camera tracks, and toggle your own camera.

To keep this code snippet concise, it only handles video, it ignores cleaning up participants who leave, and it glosses over adding views/elements to your app.

iOS

Android

Instead of calling release() yourself as above, you can provide a lifecycle parameter to the CallClient constructor. For example, within an Activity you might invoke CallClient(applicationContext, this.lifecycle).

This will release the call client as soon as the activity is destroyed, which would stop your app from running a call in the background. Most call applications probably want to keep running when a user switches apps, so we advise calling release() yourself.

Navigating the API

The functionality in the Daily Client SDKs for Android and iOS can be grouped by the tasks involved in building a successful video call experience:

We'll review each below with some code examples.


Setting up the call client

The CallClient is your main interface into your Daily video call. It's an object that you instantiate and hold onto at least for the duration of your call.

Tip: You can also hold onto the CallClient longer and use it for multiple sequential calls.

iOS

Android

In order to write code that dynamically reacts to changes in the call, such as participants joining, leaving, muting, or unmuting, you'll need to register to receive events from the CallClient. Note that you can register to receive events at any point in the call lifecycle, and that your event listeners will remain on the CallClient even after you leave() a call.

iOS

On iOS, you would make a class (such as your UIViewController sub-class) conform to the CallClientDelegate protocol and assign it to the call's delegate property:

Android

On Android, you add listeners to the call:

Managing the call lifecycle

Managing the call lifecycle mostly comes down to deciding when to join() and when to leave() a call, and updating your UI according to the current callState().

When you successfully join() a call, other participants in the same room become visible to you and you become visible to them. “Becoming visible” in this case means gaining an entry in each other's participants() object, with some basic information filled out. It does not necessarily mean that their media is immediately available to you. That depends on whether or not they're publishing any tracks and whether or not you're subscribed to any tracks.

iOS

Android

When using the iOS SDK, you may want to set the isIdleTimerDisabled property on UIApplication to true to keep the device screen on during a call. Set this property back to false when not in a call, so that the screen goes to sleep after the time specified in the system settings. You can observe the CallClient.NotificationName.didJoinFirstCall and CallClient.NotificationName.didLeaveLastCall notifications to manage this property, as seen below.

iOS

To show the right UI in your app for the right callState, you'll probably want to listen for changes in callState.

iOS

Android

Note that the pool of participants in a call may always be in flux, so you probably don't just want to look up participants() once (as in the above example) but rather build your app so it reacts to their continual joining, leaving, and changing.

Which brings us to...

Handling participants

Throughout a call, participants may come and go. They may do things like mute and unmute their mic or their camera. Additionally, the remote participants may start or stop screen sharing.

To handle all these changes, you'll want to build your app so that it reacts appropriately no matter what changed.

iOS

Android
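One way to structure this is to keep the latest snapshot of each participant in a map keyed by id, and to route every join, update, and leave event through it. The sketch below is plain Kotlin for illustration only: `ParticipantInfo` and `ParticipantRegistry` are hypothetical stand-ins, not Daily SDK types, and the wiring to the SDK's events is left out.

```kotlin
// Hypothetical stand-in for a participant snapshot; not a Daily SDK type.
data class ParticipantInfo(val id: String, val cameraOn: Boolean)

class ParticipantRegistry {
    private val byId = LinkedHashMap<String, ParticipantInfo>()

    // Joined and updated events are handled the same way: keep the latest snapshot.
    fun onJoinedOrUpdated(p: ParticipantInfo) { byId[p.id] = p }

    fun onLeft(id: String) { byId.remove(id) }

    // Ids of participants whose camera is currently on, in join order.
    fun withCamera(): List<String> = byId.values.filter { it.cameraOn }.map { it.id }
}

fun main() {
    val registry = ParticipantRegistry()
    registry.onJoinedOrUpdated(ParticipantInfo("a", cameraOn = false))
    registry.onJoinedOrUpdated(ParticipantInfo("b", cameraOn = true))
    registry.onJoinedOrUpdated(ParticipantInfo("a", cameraOn = true)) // "a" turns the camera on
    registry.onLeft("b")
    println(registry.withCamera()) // prints [a]
}
```

Because the UI reads from the registry rather than from individual events, the same code path covers the initial join and every later change.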

Arguably the most important participant fields — other than id — are the media tracks themselves. Each media track has a state, which you'll also likely use in addition to the track.

We have the following media tracks:

  • camera: The participant's camera track and track state.
  • microphone: The participant's microphone track and track state.
  • screenAudio: The participant's screen audio track and track state.
  • screenVideo: The participant's screen video track and track state.

iOS

Android

Note that the local user is represented as a participant too! They get a special entry in participants() with a special isLocal field set to true. The local participant will have media populated by any enabled inputs.

Review our Managing media section below for more information.

iOS

Android

Also note that when you first join() a call, you'll receive "participant-joined" events for all the remote participants, even if they were in the call before you. This can help you build your UI without needing to write special code paths for the initial join.

Handling the active speaker

If you are building an app with multiple participants, one important piece of information is the active speaker in the call.

In addition to knowing who the current active speaker is, it's also important to know when the active speaker has changed (i.e. when someone new has started to speak in the call). Let's look at how to update your app to track and react to the active speaker.

iOS

Android
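Independent of either SDK, the "who is speaking" and "did it change" parts can be separated with a small tracker. The following is a minimal sketch in plain Kotlin; `ActiveSpeakerTracker` and its callback are hypothetical names, and it assumes the app feeds it the participant id from whatever active speaker event the SDK delivers.

```kotlin
// Hypothetical helper; not part of the Daily SDK.
class ActiveSpeakerTracker(
    private val onChange: (previous: String?, current: String?) -> Unit
) {
    var current: String? = null
        private set

    // Call this with the participant id carried by each active speaker event.
    fun onActiveSpeakerEvent(id: String?) {
        if (id == current) return // same speaker as before: nothing to update
        val previous = current
        current = id
        onChange(previous, id)
    }
}

fun main() {
    val changes = mutableListOf<String>()
    val tracker = ActiveSpeakerTracker { prev, cur -> changes.add("$prev -> $cur") }
    tracker.onActiveSpeakerEvent("alice")
    tracker.onActiveSpeakerEvent("alice") // repeat, ignored
    tracker.onActiveSpeakerEvent("bob")
    println(changes) // prints [null -> alice, alice -> bob]
}
```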

Managing media

Managing media is one of the most important tasks in building a video call experience. It can also be one of the more complicated tasks.

There are a few collections of settings that dictate how media (i.e., camera and microphone tracks) gets from the local user to remote participants, and how remote participants' media gets to the user.

Outbound media

  • Inputs: the sender-side settings dictating what pieces of media should be gathered from the user (e.g. a camera track) and how (e.g. at what resolution or frame rate)
  • Publishing: the sender-side settings dictating what pieces of media to send to all participants (e.g. the camera track), and how (e.g. with what encoding settings)

Inbound media

  • Subscriptions: the receiver-side settings dictating what pieces of media to receive from each participant, and how (e.g. at what quality, if multiple simulcast layers are available). For more about subscription management, see Managing subscriptions.

Bytes representing a piece of media (e.g. the camera track) only flow from a sender to a receiver if:

  • The sender's inputs are configured to capture a track from the user's camera
  • The sender's publishing settings are configured to send that camera track
  • The receiver's subscriptions are configured to receive the sender's camera track
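The three conditions above amount to a logical AND. As a toy model in plain Kotlin (the types `SenderState` and `ReceiverState` are invented for illustration and do not correspond to SDK classes):

```kotlin
// Hypothetical model of the settings involved; not Daily SDK types.
data class SenderState(val cameraInputEnabled: Boolean, val cameraPublishing: Boolean)
data class ReceiverState(val subscribedCameraOf: Set<String>)

// Camera bytes flow only when inputs, publishing, and subscription all allow it.
fun cameraFlows(senderId: String, sender: SenderState, receiver: ReceiverState): Boolean =
    sender.cameraInputEnabled &&
        sender.cameraPublishing &&
        senderId in receiver.subscribedCameraOf

fun main() {
    val receiver = ReceiverState(subscribedCameraOf = setOf("alice"))
    println(cameraFlows("alice", SenderState(true, true), receiver))  // prints true
    println(cameraFlows("alice", SenderState(true, false), receiver)) // prints false (not published)
    println(cameraFlows("bob", SenderState(true, true), receiver))    // prints false (not subscribed)
}
```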

Media API fundamentals: Low-level vs. convenience methods

You can manipulate the above media settings directly through a few low-level/advanced API methods (updateInputs(), updatePublishing(), updateSubscriptions()). Alternatively, you can carry out simple tasks—like muting the microphone—using one of the media control convenience methods (setInput[s]Enabled(), setIsPublishing(), setSubscriptionState(), setSubscriptionProfile()). The following sections present convenience methods side-by-side with their low-level equivalents for illustration.

Media API fundamentals: What vs. how

A pattern you'll see repeated throughout the media control APIs is the what vs. how pattern. This makes the media control APIs feel similar to each other.

iOS

Android

Media API fundamentals: Flexibility

Daily APIs were designed with flexibility in mind, letting you update each of the collections of media settings completely independently from one another, and at any point in time.

iOS

Android

You can also update the user's media settings while they're not in a call. If you enable inputs outside of a call, the user's devices will still turn on, and their media will be available through the local entry in call.participants:

iOS

Android

Media API fundamentals: Default values and updates

You may have noticed that in the above example we never had to specify an exhaustive set of settings for inputs and publishing in join(). That's because for everything you don't specify, Daily falls back to using its own defaults.

iOS

Android

Note that this allows us to keep the most common usage concise.

iOS

Android

To peek under the hood and understand what settings Daily is using, you can always use the getters for inputs or publishing on CallClient.

iOS

Android

One nuance is worth emphasizing: when you invoke an update*() method, you provide it an update, which is a partial settings specification. update*() doesn't replace your specified settings; it merges the update into them. The below example shows how this works, and illustrates how we fall back to Daily defaults for anything not in our specified settings.

iOS

Android
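To make the merge behavior concrete without relying on the SDK, here is a small model in plain Kotlin. Everything in it is an assumption made for illustration: `CameraUpdate`, `CameraSettings`, `merge`, `effective`, and the default values are invented and are not the SDK's types or Daily's actual defaults. A `null` field means "not specified".

```kotlin
// Hypothetical types; null in CameraUpdate means "not specified".
data class CameraSettings(val isEnabled: Boolean, val width: Int, val frameRate: Int)
data class CameraUpdate(
    val isEnabled: Boolean? = null,
    val width: Int? = null,
    val frameRate: Int? = null
)

// Made-up defaults, standing in for whatever the platform would choose.
val DEFAULTS = CameraSettings(isEnabled = true, width = 1280, frameRate = 30)

// Merging keeps earlier specified values unless the new update overrides them.
fun merge(specified: CameraUpdate, update: CameraUpdate) = CameraUpdate(
    isEnabled = update.isEnabled ?: specified.isEnabled,
    width = update.width ?: specified.width,
    frameRate = update.frameRate ?: specified.frameRate
)

// Effective settings: the specified value where present, the default otherwise.
fun effective(specified: CameraUpdate, defaults: CameraSettings = DEFAULTS) = CameraSettings(
    isEnabled = specified.isEnabled ?: defaults.isEnabled,
    width = specified.width ?: defaults.width,
    frameRate = specified.frameRate ?: defaults.frameRate
)

fun main() {
    var specified = CameraUpdate()
    specified = merge(specified, CameraUpdate(width = 640))
    specified = merge(specified, CameraUpdate(frameRate = 15)) // width = 640 is kept
    println(effective(specified))
    // prints CameraSettings(isEnabled=true, width=640, frameRate=15)
}
```

In this model, resetting `specified` to an empty `CameraUpdate()` makes `effective` return the defaults again.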

Finally, if at any point you want to stop using your specified settings and revert to using Daily's default settings, you can.

iOS

Android

Inputs and the local participant

As described in Handling participants, there's a special local participant that represents the local user. The local participant's media comes directly from inputs — you don't need to publish to receive your own media through Daily. You don't even need to have join()ed a call, for that matter.

iOS

Note: Since Apple does not support screen capture in the main app, you need to create a broadcast upload extension for screen sharing. Before invoking the screenVideo feature, as mentioned above, the broadcast extension must already be running. Please review our iOS screen share guide for all the details.

Android

Note: Before invoking the screenVideo feature, as mentioned above, the media projection intent must have already been provided. Please review our Android screen share guide for all the details.

Why do it this way, rather than have your local tracks available through inputs()? With all media living in participants(), it's easier to write media-handling code that's agnostic to whether it's coming from a local or remote track.

VideoView scale modes

The videoScaleMode property defines how the video scales within its view:

  • FILL: the video frame is scaled to fill the view while maintaining the aspect ratio. Some portion of the video frame may be clipped. This is the default value.
  • FIT: the video frame is scaled to fit entirely within the view while maintaining the aspect ratio. Unless the view and the video share the same aspect ratio, empty borders will be visible around the video.

iOS

Android
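The difference between the two modes comes down to which scale factor is chosen. The following plain-Kotlin sketch shows the usual aspect-fill and aspect-fit arithmetic; it is a general illustration rather than the SDK's rendering code, and `ScaleMode`, `Size`, and `scaledSize` are names made up for this example.

```kotlin
import kotlin.math.max
import kotlin.math.min

// Hypothetical types for illustration; not the SDK's view classes.
enum class ScaleMode { FILL, FIT }
data class Size(val width: Double, val height: Double)

// Returns the size of the video after uniform scaling into the view.
fun scaledSize(video: Size, view: Size, mode: ScaleMode): Size {
    val sx = view.width / video.width
    val sy = view.height / video.height
    val scale = when (mode) {
        ScaleMode.FILL -> max(sx, sy) // covers the view; the overflow is clipped
        ScaleMode.FIT -> min(sx, sy)  // fits inside the view; leftover space shows as borders
    }
    return Size(video.width * scale, video.height * scale)
}

fun main() {
    val video = Size(1280.0, 720.0)
    val view = Size(640.0, 720.0)
    println(scaledSize(video, view, ScaleMode.FILL)) // prints Size(width=1280.0, height=720.0)
    println(scaledSize(video, view, ScaleMode.FIT))  // prints Size(width=640.0, height=360.0)
}
```

In the FILL case the scaled video is wider than the 640-wide view, so its sides are clipped. In the FIT case it is shorter than the 720-high view, so borders appear above and below.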

Managing subscriptions

Daily calls operate on a publish-subscribe model: participants publish audio, video, and screen tracks (screen not yet available to publish through the mobile SDKs), and are subscribed to other participants' tracks.

By default, Daily routes a participant’s distinct set of tracks to all the other participants on a call. In large calls, receiving and decoding all that video can demand a lot of network bandwidth and processing power.

Changing Daily’s default subscription track handling in favor of your more specific use case can deliver better call quality during sessions with many participants. Track subscriptions make it possible to build features where only subsets of users are viewed at a time, like breakout groups or pages of participants in a grid view.

The base profile is the default profile used by Daily for each new participant who joins a meeting. By changing the base profile, you can change the subscription settings used by new clients who join.

For example, to create an app with audio only, you would need to update the base profile to only subscribe to audio tracks.

iOS

Android

Additionally, you can create new profiles for common scenarios in your application. Doing so allows you to map the different subscription scenarios in a call to specific profiles. Once defined, a user's subscription settings can be changed using the newly-defined profile.

For example, your application may have an active speaker layout, where the active speaker is displayed more prominently than others. Since this is a common state in your application, you can define a new profile, which in this case we'll call activeSpeaker.

iOS

Android
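Conceptually, profiles are named bundles of subscription settings, and each remote participant is assigned to one of them, falling back to the base profile. The sketch below models that idea in plain Kotlin. It is an assumption-laden illustration: `ProfileSettings` and `SubscriptionModel` are invented names and the SDK's real types and methods differ.

```kotlin
// Hypothetical model; not the Daily SDK's subscription types.
data class ProfileSettings(val camera: Boolean, val microphone: Boolean)

class SubscriptionModel(base: ProfileSettings) {
    private val profiles = mutableMapOf("base" to base)
    private val assigned = mutableMapOf<String, String>()

    fun defineProfile(name: String, settings: ProfileSettings) { profiles[name] = settings }

    fun assign(participantId: String, profile: String) {
        require(profile in profiles) { "unknown profile: $profile" }
        assigned[participantId] = profile
    }

    // Participants without an explicit assignment use the base profile.
    fun settingsFor(participantId: String): ProfileSettings =
        profiles.getValue(assigned[participantId] ?: "base")
}

fun main() {
    // Audio-only base profile, plus a profile that also receives video.
    val model = SubscriptionModel(base = ProfileSettings(camera = false, microphone = true))
    model.defineProfile("activeSpeaker", ProfileSettings(camera = true, microphone = true))
    model.assign("alice", "activeSpeaker")
    println(model.settingsFor("alice")) // prints ProfileSettings(camera=true, microphone=true)
    println(model.settingsFor("bob"))   // prints ProfileSettings(camera=false, microphone=true)
}
```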

You can check the current state of the subscriptions and subscription profiles at any time.

iOS

Android

Meeting tokens

Meeting tokens provide access to private rooms, and can pass some user-specific properties into the room. To join a private room, provide a valid meeting token to callClient.join(). The value passed in must not contain any whitespace, and must not be wrapped in apostrophes or quotes.
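If the token comes from user input or a copied string, it can be cleaned up before use. The helper below is a minimal sketch in plain Kotlin; `sanitizeToken` is a hypothetical name, and it only strips surrounding whitespace and quote characters rather than validating the token.

```kotlin
// Hypothetical helper: strips surrounding whitespace and wrapping quotes or apostrophes.
fun sanitizeToken(raw: String): String =
    raw.trim().trim('"', '\'').trim()

fun main() {
    println(sanitizeToken("  'abc.def.ghi'\n")) // prints abc.def.ghi
    println(sanitizeToken("\"abc.def.ghi\""))   // prints abc.def.ghi
}
```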

iOS

If you have a text field declared somewhere, similar to the following:

You can access the text value, trim out unwanted characters, build a MeetingToken, and pass it into join() as follows:

Android

If you have an EditText declared similar to the following:

You can use it to build a MeetingToken, which you can then pass to join(), as follows:

If a token provided during join specifies a username or user ID (e.g. an external ID from another application), the relevant participant will have that data in local and remote contexts.

iOS

Android

Async on Android: Using CallClientCoroutineWrapper

The Daily Android SDK includes a CallClientCoroutineWrapper class to provide a suspend-based version of the API. The asynchronous examples throughout this guide have supplied callbacks to methods using the pattern that follows:

Instead, you can make calls on the wrapper from within Kotlin coroutines. Each call suspends execution until the operation is done and throws an exception on failure. The following is an example of the previous snippet rewritten to use the suspend style:

To use this pattern, instantiate a CallClient, then pass it into the CallClientCoroutineWrapper constructor, and then use the wrapper as needed:
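For intuition about what a wrapper like this does, the sketch below adapts a callback-style method into a suspend function using only the Kotlin standard library. `FakeClient` and `FakeClientCoroutineWrapper` are hypothetical stand-ins written for this example. They are not the Daily classes, and the real wrapper's method signatures may differ.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback-style client: reports null on success, or an error message.
class FakeClient {
    fun join(url: String, onResult: (error: String?) -> Unit) {
        onResult(if (url.startsWith("https://")) null else "invalid url")
    }
}

// Hypothetical wrapper: suspends until the callback fires, throws on failure.
class FakeClientCoroutineWrapper(private val client: FakeClient) {
    suspend fun join(url: String): Unit = suspendCoroutine { cont ->
        client.join(url) { error ->
            if (error == null) cont.resume(Unit)
            else cont.resumeWithException(IllegalStateException(error))
        }
    }
}

fun main() {
    val wrapper = FakeClientCoroutineWrapper(FakeClient())
    val block: suspend () -> String = {
        wrapper.join("https://example.invalid/room") // succeeds and returns normally
        try {
            wrapper.join("not-a-url")
            "unexpected success"
        } catch (e: IllegalStateException) {
            "second join failed: ${e.message}"
        }
    }
    // Drive the coroutine with the standard library only (no kotlinx.coroutines).
    block.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        println(result.getOrThrow()) // prints second join failed: invalid url
    })
}
```

In an app you would normally launch such calls from a coroutine scope tied to your component's lifecycle rather than with `startCoroutine`, which is used here only to keep the example dependency-free.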