On 11/08/2021 00:05, daniel.almeida@xxxxxxxxxxxxx wrote: > From: Daniel Almeida <daniel.almeida@xxxxxxxxxxxxx> > > Dear all, > > This patchset adds the stateless AV1 uAPI and the VIVPU virtual driver to > showcase it. > > Note that this patch depends on dynamically allocated control arrays, i.e. [0] > and [1], which are part of the following series[2]. > > This cover letter will discuss the AV1 OBUs and their relationship with the > V4L2 controls proposed therein. The VIVPU test driver will also be discussed. > > Note that I have also written a GStreamer decoder element [3] to interface with > the VIVPU virtual driver through the proposed control interface to ensure that > these three pieces actually work. The MR in gst-plugins-bad is marked as "Draft" > only because the uAPI hasn't been merged yet and there's no real hardware to > test it. > > Padding and holes have not been taken into account yet. > > > > Relevant AV1 Open Bitstream Units (OBUs): > ----------------------------------------- > > AV1 is packetized into a syntax element known as OBU, which stands for Open > Bitstream Units. There are seven different types of OBUs defined in the AV1 > specification, of which five are of interest for the purposes of this API, they > are: > > Sequence Header OBU: Contains information that applies to the entire sequence. > Most importantly, it contains a set of flags that signal which AV1 features are > enabled for the entire video coded sequence. The sequence header OBU also > encodes the sequence profile. > > Frame Header OBU: Contains information that applies to an entire frame. Notably, > this OBU will dictate the frame's dimensions, its frame type, quantization, > segmentation and filter parameters as well as the set of reference frames needed > to effect a decoding operation. A set of flags will signal whether some AV1 > features are enabled for a particular frame. > > Tile Group OBU: Contains tiling information. Tile groups contain the tile data > associated with a frame. Tiles are subdivisions of a picture that can be > independently decoded, optionally in parallel. The entire frame is assembled > from all the tiles after potential loop filtering. > > Frame OBU: Shorthand for a frame header OBU plus a tile group OBU but with less > overhead. Frame OBUs are a convenience for the common case in which a frame > header is combined with tiling information. > > Tile List OBU: Similar to a tile group OBU, but used in "Large Scale > Tile Decoding Mode". The tiling information contained in this OBU has an > additional header that allows the decoder to process a subset of tiles and > display the corresponding part of the image without having to fully decode all > the tiles for a frame. > > > > AV1 uAPI V4L2 CIDs: > ------------------- > > V4L2_CID_STATELESS_AV1_SEQUENCE: represents a Sequence Header OBU. This control > should only be set once per Sequence Header OBU. The "flags" member contains a > bitfield with the set of flags for the current video coded sequence as parsed > from the bitstream. > > V4L2_CID_STATELESS_AV1_FRAME_HEADER: represents a Frame Header OBU. This control > should be set once per frame. > > V4L2_CID_STATELESS_AV1_{TILE_GROUP|TILE_GROUP_ENTRY}: represents a Tile Group > OBU or the tiling information within a Frame OBU. These controls contain an > array of metadata to decode the tiles associated with a frame. Both controls > depend on V4L2_CTRL_FLAG_DYNAMIC_ARRAY and drivers will be able to index into > the array using ctrl->p_cur.p_av1_tile_group and > ctrl->p_cur.p_av1_tile_group_entry as base pointers respectively. Frame OBUs > should be split into their Frame Header OBU and Tile Group OBU constituents > before the array entries can be set and there should be a maximum of 512 tile > group entries as per the AV1 specification. In the event that more than one tile > group is provided, drivers can disambiguate their corresponding entries in the > ctrl->p_cur.p_av1_tile_group_entry array by taking note of the tg_start and > tg_end fields. > > V4L2_CID_STATELESS_AV1_{TILE_LIST|TILE_LIST_ENTRY}: represents a Tile List OBU. > These controls contain an array of metadata to decode a list of tiles associated > with a frame when the decoder is operating under "Large Scale Tile Decoding > Mode". Both controls depend on V4L2_CTRL_FLAG_DYNAMIC_ARRAY, and drivers will be > able index into the array using ctrl->p_cur.p_av1_tile_list and > ctrl->p_cur.p_av1_tile_list_entry as base pointers respectively. In the event > that more than one list is provided, drivers can disambiguate their > corresponding entries in the ctrl->p_cur.p_av1_tile_list_entry array by taking > note of the tile_count_minus_1 field. It's a bit hard to tell for non-AV1 experts how these tile controls relate to one another. This is my understanding: Without tiling only a V4L2_CID_STATELESS_AV1_FRAME_HEADER is needed. With tiling you need a V4L2_CID_STATELESS_AV1_FRAME_HEADER and V4L2_CID_STATELESS_AV1_{TILE_GROUP|TILE_GROUP_ENTRY} arrays. With 'Large Scale Tile Decoding' you need a V4L2_CID_STATELESS_AV1_FRAME_HEADER and V4L2_CID_STATELESS_AV1_{TILE_LIST|TILE_LIST_ENTRY} arrays (I think). It's not clear to me if these TILE_LISTs replace TILE_GROUPs or add to them. If it is the latter, then the V4L2_CID_STATELESS_AV1_{TILE_GROUP|TILE_GROUP_ENTRY} arrays also need to be set for each frame. In any case, this should probably be clarified in the documentation as well. Regards, Hans > > V4L2_CID_STATELESS_AV1_PROFILE: this control lets the driver convey the > supported profiles to userspace. > > V4L2_CID_STATELESS_AV1_LEVEL: this control lets the driver convey the supported > AV1 levels to userspace. > > V4L2_CTRL_AV1_OPERATING_MODE: this control lets the driver convey the supported > operating modes to userspace. Conversely, userspace apps can change the value of > this control to switch between "general decoding" and "large scale tile > decoding". As per the AV1 specification, under *general decoding mode* the > driver should expect the input to be a sequence of OBUs and the output to be a > decoded frame, whereas under *large scale tile decoding mode* the driver should > expect the input to be a tile list OBU plus additional side information and the > output to be a decoded frame. > > > > VIVPU: > ------ > > This virtual driver was written as a way to showcase and test the control > interface for AV1 as well as the GStreamer decoder[3]. This is so we can detect > bugs at an early stage before real hardware is available. VIVPU does not attempt > to decode video at all. > > Once VIVPU is loaded, one can run the following GStreamer pipeline successfully: > > gst-launch-1.0 filesrc location=<path to some sample av1 file> ! parsebin ! v4l2slav1dec ! fakevideosink > > This is provided that the patches in [3] have been applied and the v4l2codecs > gstreamer plugin is compiled. > > It is also possible to print the controls' contents to the console by setting > vivpu_debug to 1. This is handy when debugging, even more so when one is > comparing two different userspace implementations because it makes it easier to > diff the controls that were passed to the kernel. > > [0] https://patchwork.linuxtv.org/project/linux-media/patch/20210610113615.785359-2-hverkuil-cisco@xxxxxxxxx/ > > [1] https://patchwork.linuxtv.org/project/linux-media/patch/20210610113615.785359-3-hverkuil-cisco@xxxxxxxxx/ > > [2] https://patchwork.linuxtv.org/project/linux-media/list/?series=5647 > > [3] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2305 > > Daniel Almeida (2): > media: Add AV1 uAPI > media: vivpu: add virtual VPU driver > > .../userspace-api/media/v4l/biblio.rst | 10 + > .../media/v4l/ext-ctrls-codec-stateless.rst | 1268 +++++++++++++++++ > .../media/v4l/pixfmt-compressed.rst | 21 + > .../media/v4l/vidioc-g-ext-ctrls.rst | 36 + > .../media/v4l/vidioc-queryctrl.rst | 54 + > .../media/videodev2.h.rst.exceptions | 9 + > drivers/media/test-drivers/Kconfig | 1 + > drivers/media/test-drivers/Makefile | 1 + > drivers/media/test-drivers/vivpu/Kconfig | 16 + > drivers/media/test-drivers/vivpu/Makefile | 4 + > drivers/media/test-drivers/vivpu/vivpu-core.c | 418 ++++++ > drivers/media/test-drivers/vivpu/vivpu-dec.c | 491 +++++++ > drivers/media/test-drivers/vivpu/vivpu-dec.h | 61 + > .../media/test-drivers/vivpu/vivpu-video.c | 599 ++++++++ > .../media/test-drivers/vivpu/vivpu-video.h | 46 + > drivers/media/test-drivers/vivpu/vivpu.h | 119 ++ > drivers/media/v4l2-core/v4l2-ctrls-core.c | 286 +++- > drivers/media/v4l2-core/v4l2-ctrls-defs.c | 79 + > drivers/media/v4l2-core/v4l2-ioctl.c | 1 + > include/media/v4l2-ctrls.h | 12 + > include/uapi/linux/v4l2-controls.h | 796 +++++++++++ > include/uapi/linux/videodev2.h | 15 + > 22 files changed, 4342 insertions(+), 1 deletion(-) > create mode 100644 drivers/media/test-drivers/vivpu/Kconfig > create mode 100644 drivers/media/test-drivers/vivpu/Makefile > create mode 100644 drivers/media/test-drivers/vivpu/vivpu-core.c > create mode 100644 drivers/media/test-drivers/vivpu/vivpu-dec.c > create mode 100644 drivers/media/test-drivers/vivpu/vivpu-dec.h > create mode 100644 drivers/media/test-drivers/vivpu/vivpu-video.c > create mode 100644 drivers/media/test-drivers/vivpu/vivpu-video.h > create mode 100644 drivers/media/test-drivers/vivpu/vivpu.h >