Potential ways to describe virtio-video device capabilities

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]



I'm working on updating virtio-video draft v8 spec [1] and the virtio-video V4L2 driver [2]. One of the things, that I don't like in the current spec draft is sharing the device's capabilities with the guest VM. The main requirement is making these capabilities compatible with V4L2.

These capabilities could be pretty complex, see [3] and [4]:
1. First there could be several coded video formats. Coded formats have their specific sets of supported controls. 2. Then for each coded video formats there could be different sets of raw video formats, that could be used in combination with the selected encoded format for decoding/encoding. 3. Then for each combination of the coded and raw format there could be different sets of supported resolutions. 4. (Optional) for each coded format, raw format and resolution there could be different sets of supported frame rates/intervals.

In the future new formats, controls, flags, etc could be defined. Right now there is a rather static structure, see section (VIRTIO_VIDEO_CMD_DEVICE_QUERY_CAPS) in [5]. It looks too inflexible.

The V4L2 approach with many different ioctl's and complex querying logic doesn't fit well for virtio-video IMHO. syscall overhead is only a few hundred nanoseconds, so doing tens or hundreds of them is bearable in case of video. But a roundtrip over virtio may take hundreds of microseconds even in the local case. We at OpenSynergy already have setups where the real hardware is accessed over a network. Then it can take a millisecond. People already seem to agree, that this amount of overhead makes V4L2-style discovery process unbearable on a per stream basis. So all the relevant data has to be obtained during the device probing. This means, that in many cases there is a complex structure with all the data on the device side, and we just need to move it to the driver side. Moving it in one step seems easier to me and better because of the latency. Boot time matters too sometimes.

One of the ideas is to replace the mentioned VIRTIO_VIDEO_CMD_DEVICE_QUERY_CAPS command response with a standalone Device Tree Blob. It looks the most promising to me right now. I guess, it may sound crazy to many people, but actually it fits into one of the device tree usage patterns outlined in [6]. This document is referenced in the kernel device tree documentation, so I assume it is correct.

A simplified version could look like this, for example:


/ {
    virtio-video {
        compatible = "virtio,video";

        virtio,video-caps {
            h264 {
                profiles-mask = <0x3ffff>;
                levels-mask = <0xfffff>;
                num-resources-range = <1 32>;
                buffer-size = <0x100000>;
                bitrates-range = <100000 10000000>;

                yuv420 {
                    plane-layout-mask = <0x3>;
                    plane-align = <1>;
                    stride-align-mask = <0x10>;
                    widths-range-stepwise = <96 1920 8>;
                    heights-range-stepwise = <96 1080 1>;
                    num-resources-range = <4 50>;

                nv12 {
                    /* ... */

                rgba {
                    /* ... */

            hevc {
                /* ... */

            vp8 {
                /* ... */

            vp9 {
                /* ... */

Or maybe the resolutions could be defined separately and linked using phandles to avoid duplication because they are going to depend on the raw formats exclusively in most cases, I think.

There are many benefits IMO:

1. Device tree can be used to describe arbitrary trees (and even arbitrary graphs with phandles). The underlying data structure is generic. It looks like it can fit very well here. 2. There is a specification of the format [7]. I hope it could be referenced in the virtio spec, right? 3. There is already DTS, DTC, libfdt and DTB parsing code in Linux. All of this can be reused. For example, at the moment we have a custom configuration file format and a big chunk of code to handle it in our virtio-video device. These could be replaced by DTS and calls to libfdt completely, I think. There is also an implementation in Rust [8]. 4. I think the standalone DTB could be integrated into a board DTB later reducing the amount of queries to zero. It is not going to make a big difference in latency though.

If device trees are used, then we'll need add a binding/schema to the kernel or to the dt-schema repo [9]. Maybe the schema could be referenced in the virtio-video spec instead of duplicating it? This would reduce the spec size.

I don't know if anybody has already done anything like this and I'm not sure if the device tree maintainers and other involved parties would approve it. That's why I'm starting this thread. Please share your opinions about the idea.

An alternative to using device trees would be inventing something similar and simpler (without phandles and strings) from scratch. That's fine too, but doesn't allow to reuse the tooling and also is going to make the virtio-video spec even bigger.

[1] https://lore.kernel.org/virtio-comment/20230629144915.597188-1-Alexander.Gordeev@xxxxxxxxxxxxxxx/ [2] https://lore.kernel.org/linux-media/20200218202753.652093-1-dmitry.sepp@xxxxxxxxxxxxxxx/ [3] https://docs.kernel.org/userspace-api/media/v4l/dev-decoder.html#querying-capabilities [4] https://docs.kernel.org/userspace-api/media/v4l/dev-encoder.html#querying-capabilities
[5] https://drive.google.com/file/d/1uPg4kxThlNPBSiC4b5veyFz4OFGytU7v/view
[6] https://elinux.org/Device_Tree_Usage#Device_Specific_Data
[7] https://www.devicetree.org/specifications/
[8] https://github.com/rust-vmm/vm-fdt
[9] https://github.com/devicetree-org/dt-schema

Kind regards,
Alexander Gordeev

Alexander Gordeev
Senior Software Engineer

OpenSynergy GmbH
Rotherstr. 20, 10245 Berlin

[Index of Archives]     [Linux Virtualization]     [Linux Virtualization]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]     [Monitors]