Daily VCS Layout API
As mentioned in the VCS core concepts, one of the goals of VCS is for the engine to be lightweight, easily embeddable, and to offer clear performance guarantees even when features are combined. We want your compositions to render reliably and without surprises, both on Daily's cloud rendering system and within your client applications.
To this end, VCS doesn't include CSS. While CSS is a rich and powerful standard, it's not compatible with the aforementioned goals. Rendering the full range of visual properties, effects and animations available in CSS practically requires a GPU (Graphics Processing Unit), which is not available on standard servers. Requiring a GPU would make it very hard to scale VCS across different targets like media servers and offline post-processing services.
Instead, VCS breaks down CSS's responsibilities into separate properties available on the built-in components. These are grouped into three separate specifications:
- Style. In VCS, a "style" strictly means the properties applied when generating the content of an element. Examples of style attributes would be font size or fill color. Styles are applied with the style prop, which you can find documented for each built-in component. See the <Text> component's style prop as an example.
- Compositing. This refers to transformations and blending properties applied at the last stage when rendering the scene. In VCS, it's accessible by applying compositing props such as blend. You can read more about this topic in the reference for the built-in components.
- Layout. This is the topic of this section. Layout is applied by passing the layout prop to VCS components.
Video layouts are quite different from text documents and interactive application UIs. The layout engines available in CSS were designed for those two use cases. The VCS layout engine is designed purely for video and presents a functional rather than declarative API.
A fundamental difference in design is that video tends to fill the viewport (i.e. the screen or video content area) by default. Traditional HTML documents are laid out as a flow that expands vertically outside the viewport. This doesn't apply to videos and — indeed — having content flow outside the viewport is generally undesirable.
When designing graphics for video, designers tend to split the available viewport into areas like "lower third" (for title graphics overlays) or "sidebar". The VCS layout engine takes inspiration here and operates on the same principle of splitting and shrinking the available frame. Every layout starts with the full viewport as the frame, and the tree of layout functions will gradually modify the frame as they get called. To create a "lower third", you'd simply write a layout function that returns coordinates where Y is adjusted down by 2/3 and height is reduced to 1/3. Any components nested inside will then receive this as the "parent frame" which they can adjust further.
In CSS, everything is expressed by declaring properties which the engine interprets to produce the final layout. It's powerful and often easy to use, but creates a certain engine bloat. There are hundreds of separate CSS properties, some of them conflicting with each other, and many even come with their own specific mini-languages tucked into string values that must be parsed and executed.
It's not possible for an HTML/CSS author to add new layout models to the engine. There's a fixed set of engines available, like "flexbox" and "grid". Since any author can request them at any time, the rendering engine must include everything. A modern web engine is a multi-hundred-megabyte affair.
One way to think of VCS's layout property is that it's like CSS display, but you get to decide exactly what it means and which configuration parameters it takes.
The value of the layout property must be an Array that contains one or two values:
- The layout function (a Function)
- Params passed to the layout function (an Object)
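As a minimal sketch of such a value, the array could be built as follows. The placeholder body of lowerThird and the value 1 for pad_gu are assumptions made only so the snippet runs on its own; in a composition, the array would be passed as the layout prop of a component.

```javascript
// Placeholder layout function so this snippet is self-contained.
// It simply passes the parent frame through unchanged.
function lowerThird(parentFrame, params) {
  return { ...parentFrame };
}

// The layout value: [layout function, params object].
// The params object is optional; { pad_gu: 1 } is an illustrative value.
const layout = [lowerThird, { pad_gu: 1 }];

console.log(typeof layout[0]); // "function"
console.log(layout[1].pad_gu); // 1
```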
Here we're passing a layout function named lowerThird and giving it a layout params object containing a property named pad_gu. (We'll return to this shortly.)
If an element doesn't have the layout property, it will simply inherit the parent frame, which is the entire viewport by default.
Let's look at a possible implementation of the lowerThird layout function:
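The following is a sketch of one such implementation, not necessarily the exact code from the VCS SDK. It follows the rule described earlier: Y moves down by 2/3 of the parent height, and the height shrinks to 1/3. The params argument is accepted but not used yet.

```javascript
// Sketch of a "lower third" layout function.
// parentFrame is a frame rectangle: { x, y, w, h } in pixels.
function lowerThird(parentFrame, params) {
  let { x, y, w, h } = parentFrame;

  h = h / 3;   // keep one third of the parent's height
  y += 2 * h;  // move down past the upper two thirds

  return { x, y, w, h };
}

// Example with a 1280x720 viewport as the parent frame:
console.log(lowerThird({ x: 0, y: 0, w: 1280, h: 720 }, { pad_gu: 1 }));
// { x: 0, y: 480, w: 1280, h: 240 }
```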
The return value of a layout function must be an object with the properties x, y, w, h. This is called a "frame", and it describes a rectangle within the viewport in absolute pixel coordinates.
The most important input argument is parentFrame, which is similarly a frame rectangle. The purpose of your function is to transform this rectangle.
For simple layout functions like lowerThird above, you don't need to think about what the frame units are; it's just slicing down the available frame using a rule contained in the function.
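To illustrate how frames are handed down through nested elements, here is a small sketch. The layout functions rightSidebar and topHalf and the helper resolveNestedFrame are all hypothetical names invented for this illustration; the helper only imitates the idea of the engine passing each result on as the next parent frame and is not the VCS engine itself.

```javascript
// Hypothetical layout function: keep the rightmost quarter of the parent.
function rightSidebar(parentFrame) {
  const w = parentFrame.w / 4;
  return {
    x: parentFrame.x + parentFrame.w - w,
    y: parentFrame.y,
    w,
    h: parentFrame.h,
  };
}

// Hypothetical layout function: keep the top half of the parent.
function topHalf(parentFrame) {
  return { x: parentFrame.x, y: parentFrame.y, w: parentFrame.w, h: parentFrame.h / 2 };
}

// Stand-in for nesting: each function receives the previous result
// as its parent frame, starting from the full viewport.
function resolveNestedFrame(viewportFrame, layoutFns) {
  return layoutFns.reduce((frame, fn) => fn(frame), viewportFrame);
}

const viewportFrame = { x: 0, y: 0, w: 1280, h: 720 };
const nestedFrame = resolveNestedFrame(viewportFrame, [rightSidebar, topHalf]);
console.log(nestedFrame); // { x: 960, y: 0, w: 320, h: 360 }
```

Each function only needs to know about the rectangle it receives, which is what keeps simple slicing rules independent of the output size.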
However, for more complex rules, you need to know about the wider context of where the layout is being applied. Let's look at that soon, but first a design detour…
You may have noticed that our previous lowerThird implementation didn't actually make use of the pad_gu param we passed in. What is the significance of the "_gu" suffix here? It refers to the "grid unit", a standard way to express device-independent measurements in VCS.
The grid unit is a designer-friendly, device-independent unit. The default grid size is 1/36 of the output's minimum dimension. In other words, 1 gu = 20px on a 720p stream (and 30px on a 1080p stream).
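The arithmetic can be checked with a short sketch. The helper name defaultGridUnitPx is hypothetical and not part of VCS; it only restates the 1/36 rule described above.

```javascript
// Default grid: 1 gu = 1/36 of the output's smaller dimension.
const GRID_DIVISIONS = 36;

// Hypothetical helper, for illustration only.
function defaultGridUnitPx(outputW, outputH) {
  return Math.min(outputW, outputH) / GRID_DIVISIONS;
}

console.log(defaultGridUnitPx(1280, 720));  // 20 (720p landscape)
console.log(defaultGridUnitPx(1920, 1080)); // 30 (1080p landscape)
console.log(defaultGridUnitPx(720, 1280));  // 20 (720p portrait: the minimum dimension is still 720)
```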
The more informal definition is that 1 gu is a good minimum text size for a video stream. It's small but still readable on a TV screen. By defining values in grid units, they're automatically aligned on the default grid, and it's easier to get eye-pleasing results where things align, margins are multiples of a common base size, and measurements are guaranteed to scale correctly to different output sizes.
It's currently not possible to redefine the grid unit size yourself, but this may eventually be added to Daily's streaming/recording API.
Returning to the lowerThird example above, we can now understand that pad_gu is supposed to express a padding measured in grid units. Remember that the layout function must return absolute coordinates. It seems like we need to access the grid unit → pixels scale factor somehow.
There is a third argument passed to the layout function. It's an object called the "layout context", and it provides a pixelsPerGridUnit property just for this purpose. Let's access the pad_gu param value and scale it. We can now complete the lowerThird implementation.
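A possible completed version is sketched below. How the padding is applied (as an equal inset on all four sides of the lower-third frame) is an assumption made for this example; the source only establishes that pad_gu is scaled by pixelsPerGridUnit.

```javascript
// Sketch: lower third with padding given in grid units.
function lowerThird(parentFrame, params, layoutCtx) {
  let { x, y, w, h } = parentFrame;

  // Convert the padding from grid units to pixels.
  const pxPerGu = layoutCtx.pixelsPerGridUnit;
  const pad = params && params.pad_gu ? params.pad_gu * pxPerGu : 0;

  // Slice the parent down to its lower third.
  h = h / 3;
  y += 2 * h;

  // Assumption: inset the result by the padding on every side.
  x += pad;
  y += pad;
  w -= 2 * pad;
  h -= 2 * pad;

  return { x, y, w, h };
}

// Mock layout context for a 720p output (1 gu = 20px).
const ctx720 = { pixelsPerGridUnit: 20 };
console.log(lowerThird({ x: 0, y: 0, w: 1280, h: 720 }, { pad_gu: 1 }, ctx720));
// { x: 20, y: 500, w: 1240, h: 200 }
```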
The layout context offers the following properties:
- pixelsPerGridUnit (Number). We saw this used above. It's the scale factor to convert grid units to viewport pixels.
- The viewport frame (Object). This is a frame rectangle, so it has the properties x, y, w, h. (x and y are typically zero, but you should confirm this rather than assume it.) Accessing the viewport frame is useful if you want to size something based on the viewport size regardless of how deep you are inside the layout tree.
- useIntrinsicSize (Function). This is a "layout hook", a function that lets you access data and state from the layout system. Calling this function will return the intrinsic size of the element to which your layout function is applied. For example, if your function is attached to an image component, useIntrinsicSize will return the size of the image.
- useContentSize (Function). This is also a layout hook. Calling this function will return the initial computed content size (including children) of the element to which your layout function is applied. "Initial" here means the size that was computed on the first layout pass. For details on how to use this hook, see section "Two-pass layout" below.
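As a sketch of how the intrinsic size hook might be used, the layout function below fits an element inside its parent frame while preserving aspect ratio. The function name fitIntrinsic is hypothetical, and the shape of the hook's return value (an object with w and h) is an assumption that should be checked against the VCS reference. The layout context here is a mock.

```javascript
// Hypothetical layout function: scale the element's intrinsic size to fit
// the parent frame, keep the aspect ratio, and center the result.
function fitIntrinsic(parentFrame, params, layoutCtx) {
  // ASSUMPTION: the hook returns an object shaped like { w, h }.
  const size = layoutCtx.useIntrinsicSize();
  if (!size || size.w <= 0 || size.h <= 0) {
    return { ...parentFrame }; // no usable intrinsic size: inherit the parent
  }

  const scale = Math.min(parentFrame.w / size.w, parentFrame.h / size.h);
  const w = size.w * scale;
  const h = size.h * scale;

  return {
    x: parentFrame.x + (parentFrame.w - w) / 2,
    y: parentFrame.y + (parentFrame.h - h) / 2,
    w,
    h,
  };
}

// Mock layout context pretending the element is a 320x360 image.
const mockCtx = { useIntrinsicSize: () => ({ w: 320, h: 360 }) };
console.log(fitIntrinsic({ x: 0, y: 0, w: 1280, h: 720 }, {}, mockCtx));
// { x: 320, y: 0, w: 640, h: 720 }
```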
It's a common layout requirement that a visual component should adapt and resize based on its children's content. For example, you might want to render a graphic that assumes a default minimum width and height, but is able to grow its height if text content inside the graphic flows to multiple lines.
To enable this kind of dynamic size measurement, VCS does two-pass layout when needed. A layout function implementing a dynamic container can call the useContentSize layout hook to inform the VCS layout engine that it wants to know the dynamic size of its children. On the first pass, useContentSize returns a zero size because the child sizes are not yet known. The container layout function then returns a default size which gets passed to the children as usual. If any components in the child tree contain text flows (or otherwise require more space than is available in the default size being passed), the VCS layout system will capture that information and use it to compute a final contentSize value for the container.
On the second layout pass, the container layout function again calls the useContentSize layout hook. (It's the same layout function being called again by the layout system.) This time useContentSize will return the final contentSize value that was computed in the first layout pass. The container can now use this size value to resize and position itself so that the child contents fit snugly. The layout system again proceeds to call the child tree layout functions, and this time they'll receive the parent size that was computed taking into account the contentSize from the first layout pass.
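The two passes can be imitated with a small sketch. This is not the stretch box example from the VCS SDK: the function name stretchBox, the minH_gu param, and the assumption that useContentSize returns an object with w and h are all made up for illustration, and the layout contexts are mocks standing in for what the engine would supply on each pass.

```javascript
// Hypothetical container layout: at least minH_gu tall, but grows
// to fit the measured content height once it is known.
function stretchBox(parentFrame, params, layoutCtx) {
  const minH = (params.minH_gu || 0) * layoutCtx.pixelsPerGridUnit;

  // ASSUMPTION: the hook returns { w, h }, with zeros on the first pass.
  const contentSize = layoutCtx.useContentSize();

  const h = Math.max(minH, contentSize.h);
  return { x: parentFrame.x, y: parentFrame.y, w: parentFrame.w, h };
}

const parent = { x: 0, y: 0, w: 400, h: 720 };
const boxParams = { minH_gu: 4 };

// First pass: child sizes are unknown, so the hook reports a zero size.
const firstPass = stretchBox(parent, boxParams, {
  pixelsPerGridUnit: 20,
  useContentSize: () => ({ w: 0, h: 0 }),
});

// Second pass: pretend the engine measured 130px of wrapped text.
const secondPass = stretchBox(parent, boxParams, {
  pixelsPerGridUnit: 20,
  useContentSize: () => ({ w: 400, h: 130 }),
});

console.log(firstPass.h);  // 80 (the default minimum: 4 gu * 20px)
console.log(secondPass.h); // 130 (grown to fit the content)
```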
There is an example composition available in the VCS SDK that demonstrates this principle with a basic "stretch box" that adapts to its text content. Examining the code for this example may be helpful to understand how useContentSize behaves in two-pass layout.
You can run the example with:
yarn open-browser example:stretchbox
The source is located in the file