Daily VCS Layout API

As mentioned in the VCS core concepts, one goal of VCS is for the engine to be lightweight, easily embeddable, and to offer clear performance guarantees even when features are combined. We want your compositions to render reliably and without surprises, both on Daily's cloud rendering system and within your client applications.

To this end, VCS doesn't include CSS. While CSS is a rich and powerful standard, it's not compatible with the aforementioned goals. Rendering the full range of visual properties, effects and animations available in CSS practically requires a GPU (Graphics Processing Unit), which is not available on standard servers. Requiring a GPU would make it very hard to scale VCS across different targets such as media servers and offline post-processing services.

VCS's alternate approach to CSS

Instead, VCS breaks down CSS's responsibilities into separate properties available on the built-in components. These are grouped into three separate specifications:

1. Styling

In VCS, a "style" strictly means the properties applied when generating the content of an element. Examples of style attributes would be font size or fill color. Styles are applied with the style prop, which you can find documented for each built-in component.

See the <Text> component's style prop as an example.

2. Compositing

Compositing refers to transformations and blending properties applied at the last stage when rendering the scene. In VCS, it's accessible by applying the transform and blend props. You can read more about this topic in the reference for the <Box> component.

3. Layout

This is the topic of this section. Layout is applied using the layout prop to VCS components.


Background

Video layouts are quite different from text documents and interactive application UIs. The layout engines available in CSS were designed for the latter use cases. The VCS layout engine is designed purely for video and presents a functional rather than declarative API.

A fundamental difference in design is that video tends to fill the viewport (i.e. the screen or video content area) by default. Traditional HTML documents are laid out as a flow that expands vertically outside the viewport. This doesn't apply to video; indeed, having content flow outside the viewport is generally undesirable.

When designing graphics for video, designers tend to split the available viewport into areas like "lower third" (for title graphics overlays) or "sidebar". The VCS layout engine takes inspiration here and operates on the same principle of splitting and shrinking the available frame. Every layout starts with the full viewport as the frame, and the tree of layout functions will gradually modify the frame as they get called. To create a "lower third", you'd simply write a layout function that returns coordinates where Y is adjusted down by 2/3 and height is reduced to 1/3. Any components nested inside will then receive this as the "parent frame" which they can adjust further.
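The splitting principle can be sketched with plain frame objects. The helper names below (lowerThirdOf, insetBy) are hypothetical and not part of VCS, and the 1280x720 viewport and 20px margin are illustrative values. The actual layout function signature is covered later in this section.

```javascript
// A frame is a rectangle: { x, y, w, h }. Start from a 1280x720 viewport.
const viewport = { x: 0, y: 0, w: 1280, h: 720 };

// Hypothetical helper: keep only the bottom third of a frame.
function lowerThirdOf(frame) {
  const h = frame.h / 3;
  return { x: frame.x, y: frame.y + frame.h - h, w: frame.w, h };
}

// Hypothetical helper: shrink a frame by a fixed margin on every side.
function insetBy(frame, margin) {
  return {
    x: frame.x + margin,
    y: frame.y + margin,
    w: frame.w - 2 * margin,
    h: frame.h - 2 * margin,
  };
}

// A nested element receives its parent's result as its own parent frame.
const titleArea = lowerThirdOf(viewport); // { x: 0, y: 480, w: 1280, h: 240 }
const titleText = insetBy(titleArea, 20); // { x: 20, y: 500, w: 1240, h: 200 }
```

Each step only narrows the frame it was given, so nested content stays inside the area its parent carved out.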

Functional vs. declarative

In CSS, everything is expressed by declaring properties which the engine interprets to produce the final layout. It's powerful and often easy to use, but creates a certain engine bloat. There are hundreds of separate CSS properties, some of them conflicting with each other, and many even come with their own specific mini-languages tucked into string values that must be parsed and executed.

An HTML/CSS author cannot add new layout models to the engine themselves. There's a fixed set of engines available, like "flexbox" and "grid". Since any author can request them at any time, the rendering engine must include everything. A modern web engine is a multi-hundred-megabyte affair.

VCS instead uses function-based layout that lets you plug in the exact code you need. It keeps memory footprint down, and more complex layout models can be provided as regular JavaScript libraries loaded as needed. There is no need to wait for the entire system to be updated if your layout engine is missing that one setting!

The layout property

One way to think of VCS's layout property is that it's like CSS display, but you get to decide exactly what it means and which configuration parameters it takes.

The value of the layout property must be an Array that contains one or two values:

  1. The layout function (a Function object). (Mandatory)
  2. Params passed to the layout function (an Object). (Optional)

For example:
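A minimal sketch of such a value follows. The pad_gu value of 1 is illustrative, the lowerThird function here is only a placeholder, and the JSX usage on a <Box> component is shown as a comment so that the snippet runs as plain JavaScript.

```javascript
// Placeholder layout function; it returns the parent frame unchanged.
function lowerThird(parentFrame, params) {
  return { ...parentFrame };
}

// The layout value: [layout function, optional params object].
const layout = [lowerThird, { pad_gu: 1 }];

// In a composition, this array would be passed as the layout prop, e.g.:
//   <Box layout={[lowerThird, { pad_gu: 1 }]}>...</Box>
```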

Here we're passing a layout function named lowerThird and giving it a layout params object containing a property named pad_gu. (We'll return to this shortly.)

If an element doesn't have the layout property, it will simply inherit the parent frame, which is the entire viewport by default.

Let's look at a possible implementation of the lowerThird layout function:
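One way this could look, as a sketch that follows the earlier description (Y moved down by two thirds of the height, height reduced to one third). It accepts the params argument but does not use it yet.

```javascript
// Layout function: takes the parent frame and returns a new frame
// covering the bottom third of it. `params` is not used yet.
function lowerThird(parentFrame, params) {
  let { x, y, w, h } = parentFrame;
  const newH = h / 3;
  y += h - newH; // move down by 2/3 of the original height
  h = newH;
  return { x, y, w, h };
}

// With a 1280x720 parent frame:
console.log(lowerThird({ x: 0, y: 0, w: 1280, h: 720 }, {}));
// { x: 0, y: 480, w: 1280, h: 240 }
```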

The return value of a layout function must be an object with the properties x, y, w, h. This is called a "frame", and it describes a rectangle within the viewport in absolute pixel coordinates.

The most important input argument is parentFrame which is similarly a frame rectangle. The purpose of your function is to transform this rectangle.

For simple layout functions like lowerThird above, you don't need to think about what the frame units are; it's just slicing down the available frame using a rule contained in the function.

However, for more complex rules, you need to know about the wider context of where the layout is being applied. Let's look at that soon, but first a design detour…

You may have noticed that our previous lowerThird implementation didn't actually make use of the pad_gu param we passed in. What is the significance of the "_gu" suffix here? It refers to the "grid unit", a standard way to express device-independent measurements in VCS.

See our moving watermark tutorial for a complete example on how to use the layout property.

The grid unit

The grid unit is a designer-friendly, device-independent unit. The default grid size is 1/36 of the output's minimum dimension. In other words, 1 gu = 20px on a 720p stream (and 30px on a 1080p stream).
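The arithmetic behind these numbers can be checked with a small helper. The function name is hypothetical and not part of VCS, and the sketch assumes no rounding is applied; inside a layout function you would read the actual scale factor from the layout context described below.

```javascript
// Hypothetical helper: default grid unit size in pixels for an output size,
// following the stated rule of 1/36 of the smaller dimension.
function defaultPixelsPerGridUnit(outputW, outputH) {
  return Math.min(outputW, outputH) / 36;
}

console.log(defaultPixelsPerGridUnit(1280, 720)); // 20
console.log(defaultPixelsPerGridUnit(1920, 1080)); // 30
// A portrait output uses its width, since that is the smaller dimension.
console.log(defaultPixelsPerGridUnit(720, 1280)); // 20
```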

The more informal definition is that 1 gu is a good minimum text size for a video stream. It's small but still readable on a TV screen. By defining values in grid units, they're automatically aligned on the default grid, and it's easier to get eye-pleasing results where things align, margins are multiples of a common base size, and measurements are guaranteed to scale correctly to different output sizes.

We recommend specifying design measurements in grid units where possible. For example, the font size for a <Text> component can be passed as grid units using the fontSize_gu style property.

It's currently not possible to redefine the grid unit size yourself, but this may eventually be added to Daily's streaming/recording API.

The layout context object

Returning to the lowerThird example above, we can now understand that pad_gu is supposed to express a padding measured in grid units. Remember that the layout function must return absolute coordinates. It seems like we need to access the grid unit → pixels scale factor somehow.

There is a third argument passed to the layout function. It's an object called "layout context" and provides a pixelsPerGridUnit property just for this purpose. Let's access the pad_gu param value and scale it. We can now complete the lowerThird function:
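A sketch of the completed function follows. The argument name layoutCtx is our own choice, and applying the padding equally on all four sides is an assumption about what pad_gu should mean here.

```javascript
function lowerThird(parentFrame, params, layoutCtx) {
  // Convert the padding from grid units to pixels (defaults to no padding).
  const pad = (params?.pad_gu ?? 0) * layoutCtx.pixelsPerGridUnit;

  // Slice the parent frame down to its bottom third, as before.
  let { x, y, w, h } = parentFrame;
  const newH = h / 3;
  y += h - newH;
  h = newH;

  // Assumption: the padding insets the frame on all four sides.
  x += pad;
  y += pad;
  w -= 2 * pad;
  h -= 2 * pad;

  return { x, y, w, h };
}

// On a 720p output, 1 gu = 20px:
console.log(
  lowerThird({ x: 0, y: 0, w: 1280, h: 720 }, { pad_gu: 1 }, { pixelsPerGridUnit: 20 })
);
// { x: 20, y: 500, w: 1240, h: 200 }
```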

The layout context offers the following properties:

  • pixelsPerGridUnit: Number. We saw this used in the above example. It's the scale factor to convert grid units to viewport pixels.
  • viewport: Object. This is a frame rectangle, so it has the properties x, y, w, h. (x and y are typically zero, but you should confirm this rather than assume it.) Accessing the viewport frame is useful if you want to size something based on the viewport size regardless of how deep you are inside the layout tree.
  • useIntrinsicSize: Function. This is a "layout hook", a function that lets you access data and state from the layout system. Calling this function will return the intrinsic size of the current element to which your layout function is applied. For example, if your function is attached to an <Image> component, useIntrinsicSize will return the size of the image.
  • useContentSize: Function. This is a "layout hook", a function that lets you access data and state from the layout system. Calling this function will return the initial computed content size (including children) of the element to which your layout function is applied. "Initial" here means the size that was computed on the first layout pass. For details on how to use this hook, see section "Two-pass layout" below.
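The two sketches below show how these properties might be used. Both function names are hypothetical, and the shape of the value returned by useIntrinsicSize (an object with w and h) is an assumption, not something stated above. The mock context exists only so the functions can be tried outside VCS.

```javascript
// Center the element in the parent frame at its intrinsic size.
// Assumption: useIntrinsicSize() returns an object with w and h.
function centerAtIntrinsicSize(parentFrame, params, layoutCtx) {
  const size = layoutCtx.useIntrinsicSize();
  return {
    x: parentFrame.x + (parentFrame.w - size.w) / 2,
    y: parentFrame.y + (parentFrame.h - size.h) / 2,
    w: size.w,
    h: size.h,
  };
}

// Size relative to the viewport rather than the parent frame:
// a quarter-size box in the viewport's top-right corner.
function viewportCorner(parentFrame, params, layoutCtx) {
  const vp = layoutCtx.viewport;
  const w = vp.w / 4;
  const h = vp.h / 4;
  // vp.x and vp.y are included rather than assumed to be zero.
  return { x: vp.x + vp.w - w, y: vp.y, w, h };
}

// Mock layout context with hypothetical values, for experimentation only.
const mockCtx = {
  pixelsPerGridUnit: 20,
  viewport: { x: 0, y: 0, w: 1280, h: 720 },
  useIntrinsicSize: () => ({ w: 200, h: 100 }),
};

const lowerThirdFrame = { x: 0, y: 480, w: 1280, h: 240 };
console.log(centerAtIntrinsicSize(lowerThirdFrame, {}, mockCtx));
// { x: 540, y: 550, w: 200, h: 100 }
console.log(viewportCorner(lowerThirdFrame, {}, mockCtx));
// { x: 960, y: 0, w: 320, h: 180 }
```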

Two-pass layout

It's a common layout requirement that a visual component should adapt and resize based on its children's content. For example, you might want to render a graphic that assumes a default minimum width and height, but is able to grow its height if text content inside the graphic flows to multiple lines.

To enable this kind of dynamic size measurement, VCS does two-pass layout when needed. A layout function implementing a dynamic container can call the useContentSize layout hook to inform the VCS layout engine that it wants to know the dynamic size of its children. On the first pass, useContentSize returns a zero size because the child sizes are not yet known. The container layout function then returns a default size which gets passed to the children as usual. If any components in the child tree contain text flows (or otherwise require more space than is available in the default size being passed), the VCS layout system will capture that information and use it to compute a final contentSize value for the container.

On the second layout pass, the container layout function again calls the useContentSize layout hook. (It's the same layout function being called again by the layout system.) This time useContentSize will return the final contentSize value that was computed in the first layout pass. The container can now use this size value to resize and position itself so that the child contents fit snugly. The layout system again proceeds to call the child tree layout functions, and this time they'll receive the parent size that was computed taking into account the contentSize from the first layout pass.
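The sequence can be simulated outside VCS with a small sketch. Everything here is illustrative: stretchBox and simulateTwoPass are hypothetical names, the minimum size values are arbitrary, and the assumption that useContentSize returns an object with w and h (zeros on the first pass) is ours. The real layout system measures the children itself; the simulation simply takes the measured size as an input.

```javascript
// Container layout function: a default minimum size that can grow in height.
// Assumption: useContentSize() returns { w, h }, with zeros on the first pass.
function stretchBox(parentFrame, params, layoutCtx) {
  const minW = 300;
  const minH = 100;
  const contentSize = layoutCtx.useContentSize();
  return {
    x: parentFrame.x,
    y: parentFrame.y,
    w: minW,
    h: Math.max(minH, contentSize.h),
  };
}

// Hypothetical stand-in for the layout system, simulating the two passes.
function simulateTwoPass(layoutFn, parentFrame, measuredContentSize) {
  // Pass 1: child sizes are unknown, so the hook reports a zero size.
  const pass1 = layoutFn(parentFrame, {}, {
    useContentSize: () => ({ w: 0, h: 0 }),
  });
  // (Here VCS would lay out the children inside pass1 and measure them.)
  // Pass 2: the same function is called again with the measured size.
  const pass2 = layoutFn(parentFrame, {}, {
    useContentSize: () => measuredContentSize,
  });
  return { pass1, pass2 };
}

const result = simulateTwoPass(
  stretchBox,
  { x: 0, y: 0, w: 1280, h: 720 },
  { w: 300, h: 160 } // pretend the text wrapped and needs 160px of height
);
console.log(result.pass1.h); // 100
console.log(result.pass2.h); // 160
```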

There is an example composition available in the VCS SDK that demonstrates this principle with a basic "stretch box" that adapts to its text content. Examining the code for this example may be helpful to understand how useContentSize behaves in two-pass layout.

You can run the example with:

yarn open-browser example:stretchbox

The source is located in the file example/stretchbox.jsx.