Introduction to the Daily Client SDKs for Android and iOS (beta)

The Daily Client SDKs for Android and iOS allow you to build video and audio calling into your native mobile applications. This guide will help you get started by introducing you to the basics of the libraries, showing you some examples, and pointing you to where you can learn more.

The Daily Client SDKs for Android and iOS are currently in beta. We appreciate any feedback you have related to using them.

In the following guide, some features are marked as Coming soon. They will be in an upcoming (and near!) release.

To view documentation for the Daily Client SDKs, visit our:

Installation

See the installation guides for:

Demo apps

To see a working example of how to interact with the Daily Client SDKs for Android and iOS, see our public demo apps:

Hello, world!

Let’s start with a simple example. The following code snippet shows how you might join a video call, display participants’ camera tracks, and toggle your own camera.

To keep this code snippet concise, it only handles video, ignores cleaning up participants who leave, and glosses over adding views/elements to your app.

iOS

Android

Navigating the API

The functionality of the Daily Client SDKs for Android and iOS can be grouped by the tasks involved in building a successful video call experience:

  • Setting up the call client
  • Managing the call lifecycle
  • Handling participants
  • Managing media

We'll review each below with some code examples.


Setting up the call client

The CallClient is your main interface into your Daily video call. It’s an object that you instantiate and hold onto at least for the duration of your call.

Tip: You can also hold onto the CallClient longer and use it for multiple sequential calls.

iOS

Android

To write code that dynamically reacts to changes in the call, such as participants joining, leaving, muting, or unmuting, you’ll need to register to receive events from the CallClient. You can register to receive events at any point in the call lifecycle, and your event listeners will remain on the CallClient even after you leave() a call.

iOS

Android
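
The listener behavior described above can be modeled without the SDK. The following is a minimal sketch in plain Java, not the Daily API: SketchCallClient, its Listener interface, and the "joined"/"left" event names are all hypothetical, chosen only to show that a listener registered before a call keeps receiving events across leave() and a later join().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a call client; NOT the Daily SDK's CallClient.
class SketchCallClient {
    /** Hypothetical listener interface for call events. */
    interface Listener {
        void onEvent(String eventName);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private boolean inCall = false;

    // Listeners can be registered at any time, in or out of a call.
    void addListener(Listener listener) {
        listeners.add(listener);
    }

    void join() {
        inCall = true;
        emit("joined");
    }

    // Leaving ends the call but deliberately keeps the listeners registered.
    void leave() {
        inCall = false;
        emit("left");
    }

    boolean isInCall() {
        return inCall;
    }

    private void emit(String eventName) {
        for (Listener listener : listeners) {
            listener.onEvent(eventName);
        }
    }

    public static void main(String[] args) {
        SketchCallClient client = new SketchCallClient();
        List<String> log = new ArrayList<>();
        client.addListener(log::add); // registered before any call
        client.join();
        client.leave();
        client.join();                // second call, same listener
        System.out.println(log);      // [joined, left, joined]
    }
}
```

Keeping listeners on the client rather than on the call is what makes it practical to reuse one client for several sequential calls.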

Managing the call lifecycle

Managing the call lifecycle mostly comes down to deciding when to join() and when to leave() a call, and updating your UI according to the current callState().

When you successfully join() a call, other participants in the same room become visible to you and you become visible to them. “Becoming visible” in this case means gaining an entry in each other’s participants() object, with some basic information filled out. It does not necessarily mean that their media is immediately available to you. That depends on whether or not they’re publishing any tracks and whether or not you’re subscribed to any tracks.

iOS

Android

To show the right UI in your app for the right callState(), you’ll probably want to listen for changes in callState().

iOS

Android
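
One way to keep the UI in step with the call state is to map every state to exactly one screen state. The sketch below is plain Java with hypothetical names: the State enum and its five values are assumptions for illustration, so check the SDK reference for the real set of call states and the real event callback.

```java
// Hypothetical mapping from call state to UI label; NOT the Daily SDK's types.
class CallStateUi {
    // Assumed set of states, for illustration only.
    enum State { INITIALIZED, JOINING, JOINED, LEAVING, LEFT }

    // Returns the label an app might show for each state.
    static String labelFor(State state) {
        switch (state) {
            case INITIALIZED: return "Ready to join";
            case JOINING:     return "Joining...";
            case JOINED:      return "In call";
            case LEAVING:     return "Leaving...";
            case LEFT:        return "Call ended";
            default:
                throw new IllegalArgumentException("Unknown state: " + state);
        }
    }

    public static void main(String[] args) {
        // Simulate the state changes an app would receive over one call.
        for (State state : State.values()) {
            System.out.println(state + " -> " + labelFor(state));
        }
    }
}
```

In a real app, labelFor() would be called from the call state change listener rather than from a loop.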

Note that the pool of participants in a call is often in flux, so you probably don’t want to look up participants() just once (as in the above example) but rather build your app so it reacts to participants continually joining, leaving, and changing.

Which brings us to...

Handling participants

Throughout a call, participants may come and go. They may do things like mute and unmute their mic or their camera.

To handle all these changes, you’ll want to build your app so that it reacts appropriately no matter what changed.

iOS

Android

Arguably the most important participant fields, other than id, are the media tracks themselves. Each media track also has a state, which you’ll likely use alongside the track itself.

iOS

Android

Note that the local user is represented as a participant too! They get a special entry in participants() with a special isLocal field set to true. The local participant will have media populated by any enabled inputs.

Review our Managing media section below for more information.

iOS

Android

Also note that when you first join() a call, you’ll receive "participant-joined" events for all the remote participants, even if they were in the call before you. This can help you build your UI without needing to write special code paths for the initial join.
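
Because joined events also arrive for participants who were already in the call, a single event-driven roster can cover both the initial join and later changes. The following plain-Java sketch illustrates the idea. ParticipantRoster and its handler methods are hypothetical, and the update and left handlers assume the SDK emits matching events. Only "participant-joined" is named in this guide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical roster that mirrors the call's participants from events.
class ParticipantRoster {
    // Maps participant id -> whether that participant's camera is on.
    private final Map<String, Boolean> cameraOnById = new LinkedHashMap<>();

    // Called for every "participant-joined" event, including those for
    // participants who were already in the call when the local user joined.
    void onParticipantJoined(String id, boolean cameraOn) {
        cameraOnById.put(id, cameraOn);
    }

    // Called when a known participant changes, e.g. toggles their camera.
    void onParticipantUpdated(String id, boolean cameraOn) {
        if (cameraOnById.containsKey(id)) {
            cameraOnById.put(id, cameraOn);
        }
    }

    void onParticipantLeft(String id) {
        cameraOnById.remove(id);
    }

    int size() {
        return cameraOnById.size();
    }

    boolean isCameraOn(String id) {
        return Boolean.TRUE.equals(cameraOnById.get(id));
    }

    public static void main(String[] args) {
        ParticipantRoster roster = new ParticipantRoster();
        roster.onParticipantJoined("alice", true);  // already in the call
        roster.onParticipantJoined("bob", false);   // already in the call
        roster.onParticipantUpdated("bob", true);   // bob turns his camera on
        roster.onParticipantLeft("alice");
        System.out.println(roster.size());            // 1
        System.out.println(roster.isCameraOn("bob")); // true
    }
}
```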

Managing media

Managing media is one of the most important tasks in building a video call experience. It can also be one of the more complicated tasks.

There are a few main controls — collections of settings, essentially — that let you dictate how media (say, camera and microphone tracks) gets from the user to other participants, and how other participants’ media gets to the user.

Outbound media controls

  • Inputs: the sender-side settings dictating what pieces of media should be gathered from the user (e.g. a camera track) and how (e.g. at what resolution or frame rate)
  • Publishing: the sender-side settings dictating what pieces of media to send to all participants (e.g. the camera track), and how (e.g. with what encoding settings)

Inbound media controls

  • Subscriptions (Coming soon): the receiver-side settings dictating what pieces of media to receive from each participant, and how (e.g. at what quality, if multiple simulcast layers are available)

Bytes representing a piece of media (e.g. the camera track) only flow from a sender to a receiver if:

  • The sender’s inputs are configured to capture a track from the user’s camera
  • The sender’s publishing settings are configured to send that camera track
  • The receiver’s subscriptions are configured to receive the sender’s camera track
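
The three conditions above combine as a logical AND: if any one of them is false, no bytes flow. A minimal plain-Java sketch makes this explicit. The MediaFlow class and its parameter names are hypothetical and are not part of the Daily API.

```java
// Hypothetical model of the three conditions for camera media to flow.
class MediaFlow {
    // Bytes flow only when capture, publishing, and subscription all line up.
    static boolean cameraBytesFlow(boolean senderCapturesCamera,
                                   boolean senderPublishesCamera,
                                   boolean receiverSubscribedToCamera) {
        return senderCapturesCamera
                && senderPublishesCamera
                && receiverSubscribedToCamera;
    }

    public static void main(String[] args) {
        System.out.println(cameraBytesFlow(true, true, true));  // true
        // Camera is captured (e.g. for a local preview) but not published.
        System.out.println(cameraBytesFlow(true, false, true)); // false
    }
}
```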

Some notes on the current release

Subscriptions

As of the current release, subscriptions are enabled for all tracks from all participants. This is not yet configurable, but will be in a future release. Once you can restrict who receives whose media, you’ll be able to build apps with larger call sizes, as well as different kinds of apps.

Media API fundamentals: What vs. how

A pattern you’ll see repeated throughout the media control APIs is the what vs. how pattern. This makes the media control APIs feel similar to each other.

iOS

Android

Media API fundamentals: Flexibility

Daily APIs were designed with flexibility in mind, letting you update each collection of media settings independently of the others, and at any point in time.

iOS

Android

Media API fundamentals: Default values and updates

You may have noticed that in the above example we never had to specify an exhaustive set of settings for inputs and publishing in join(). That’s because for everything you don’t specify, Daily falls back to using its own defaults.

iOS

Android

Note that this allows us to keep the most common usage concise.

iOS

Android

To peer under the hood and understand what settings Daily is using, you can always use the inputs() or publishing() getter methods.

iOS

Android

One nuance is worth emphasizing: when you invoke an update*() method, you pass it an update, which is a partial settings specification. update*() doesn’t replace your specified settings; it merges the update into them. The example below shows how this works, and illustrates how we fall back to Daily defaults for anything not in our specified settings.

iOS

Android
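
The merge-then-fall-back behavior can be sketched independently of the SDK. In the plain-Java model below, a null field means "not specified". CameraSettingsSketch, its three fields, and the default values (enabled, 1280 wide, 30 fps) are all hypothetical and only stand in for whatever settings and defaults Daily actually uses.

```java
// Hypothetical partial settings: a null field means "not specified".
class CameraSettingsSketch {
    final Boolean enabled;
    final Integer width;
    final Integer frameRate;

    CameraSettingsSketch(Boolean enabled, Integer width, Integer frameRate) {
        this.enabled = enabled;
        this.width = width;
        this.frameRate = frameRate;
    }

    // Merge semantics: fields present in the update win, all others are kept.
    CameraSettingsSketch mergedWith(CameraSettingsSketch update) {
        return new CameraSettingsSketch(
                update.enabled != null ? update.enabled : enabled,
                update.width != null ? update.width : width,
                update.frameRate != null ? update.frameRate : frameRate);
    }

    // Effective settings: anything still unspecified falls back to defaults.
    CameraSettingsSketch resolvedAgainst(CameraSettingsSketch defaults) {
        return defaults.mergedWith(this);
    }

    public static void main(String[] args) {
        // Made-up defaults, for illustration only.
        CameraSettingsSketch defaults = new CameraSettingsSketch(true, 1280, 30);
        CameraSettingsSketch specified = new CameraSettingsSketch(null, null, null);

        // First update: only the width is specified.
        specified = specified.mergedWith(new CameraSettingsSketch(null, 640, null));
        // Second update: only "enabled" changes; the width of 640 is kept.
        specified = specified.mergedWith(new CameraSettingsSketch(false, null, null));

        CameraSettingsSketch effective = specified.resolvedAgainst(defaults);
        System.out.println(effective.enabled);   // false (from second update)
        System.out.println(effective.width);     // 640   (from first update)
        System.out.println(effective.frameRate); // 30    (from defaults)
    }
}
```

Note that the specified settings and the effective settings are tracked separately: frameRate stays unspecified throughout, so it would follow any change to the defaults.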

Finally, if at any point you want to stop using your specified settings and revert to using Daily’s default settings, you can.

iOS

Android

Inputs and the local participant

As described in Handling participants, there’s a special local participant that represents the local user. The local participant’s media comes directly from inputs — you don’t need to publish to receive your own media through Daily. You don’t even need to have join()ed a call, for that matter.

iOS

Android

Why do it this way, rather than have your local tracks available through inputs()? With all media living in participants(), it’s easier to write media-handling code that’s agnostic to whether it’s coming from a local or remote track.
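
To illustrate the benefit, here is a plain-Java sketch of rendering code that never branches on whether a participant is local. TileRenderer, its nested Participant class, and the string stand-in for a video track are hypothetical and not the Daily API. The isLocal field name mirrors the one described in Handling participants.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical renderer that treats local and remote participants alike.
class TileRenderer {
    // Hypothetical participant; videoTrack is null when no track is available.
    static class Participant {
        final String id;
        final boolean isLocal;
        final String videoTrack; // stand-in for a real track object

        Participant(String id, boolean isLocal, String videoTrack) {
            this.id = id;
            this.isLocal = isLocal;
            this.videoTrack = videoTrack;
        }
    }

    // One code path for every participant: nothing here checks isLocal.
    static List<String> tilesFor(List<Participant> participants) {
        List<String> tiles = new ArrayList<>();
        for (Participant p : participants) {
            tiles.add(p.id + ": " + (p.videoTrack != null ? "video" : "placeholder"));
        }
        return tiles;
    }

    public static void main(String[] args) {
        List<Participant> participants = Arrays.asList(
                new Participant("me", true, "local-camera"),  // fed by inputs
                new Participant("alice", false, null));       // no track yet
        System.out.println(tilesFor(participants));
        // [me: video, alice: placeholder]
    }
}
```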

To try out the Daily Client SDKs for Android and iOS, follow our Android and iOS installation guides. As mentioned, we appreciate any early feedback on these beta versions.