Hi Hans, thanks for the comments! On Mon, Sep 10, 2018 at 8:25 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote: > > Hi Alexandre, > > Thank you very much for working on this, much appreciated! It's still quite rough around the edges (and everywhere else actually), but that will do to get things started... > > On 08/31/2018 09:47 AM, Alexandre Courbot wrote: > > This patch documents the protocol that user-space should follow when > > communicating with stateless video decoders. It is based on the > > following references: > > > > * The current protocol used by Chromium (converted from config store to > > request API) > > > > * The submitted Cedrus VPU driver > > > > As such, some things may not be entirely consistent with the current > > state of drivers, so it would be great if all stakeholders could point > > out these inconsistencies. :) > > > > This patch is supposed to be applied on top of the Request API V18 as > > well as the memory-to-memory video decoder interface series by Tomasz > > Figa. > > > > It should be considered an early RFC. > > > > Signed-off-by: Alexandre Courbot <acourbot@xxxxxxxxxxxx> > > --- > > .../media/uapi/v4l/dev-stateless-decoder.rst | 413 ++++++++++++++++++ > > Documentation/media/uapi/v4l/devices.rst | 1 + > > .../media/uapi/v4l/extended-controls.rst | 23 + > > 3 files changed, 437 insertions(+) > > create mode 100644 Documentation/media/uapi/v4l/dev-stateless-decoder.rst > > > > diff --git a/Documentation/media/uapi/v4l/dev-stateless-decoder.rst b/Documentation/media/uapi/v4l/dev-stateless-decoder.rst > > new file mode 100644 > > index 000000000000..bf7b13a8ee16 > > --- /dev/null > > +++ b/Documentation/media/uapi/v4l/dev-stateless-decoder.rst > > @@ -0,0 +1,413 @@ > > +.. -*- coding: utf-8; mode: rst -*- > > + > > +.. _stateless_decoder: > > + > > +************************************************** > > +Memory-to-memory Stateless Video Decoder Interface > > +************************************************** > > + > > +A stateless decoder is a decoder that works without retaining any kind of state > > +between processing frames. This means that each frame is decoded independently > > +of any previous and future frames, and that the client is responsible for > > +maintaining the decoding state and providing it to the driver. This is in > > +contrast to the stateful video decoder interface, where the hardware maintains > > +the decoding state and all the client has to do is to provide the raw encoded > > +stream. > > + > > +This section describes how user-space ("the client") is expected to communicate > > +with such decoders in order to successfully decode an encoded stream. Compared > > +to stateful codecs, the driver/client protocol is simpler, but cost of this > > +simplicity is extra complexity in the client which must maintain the decoding > > +state. > > + > > +Querying capabilities > > +===================== > > + > > +1. To enumerate the set of coded formats supported by the driver, the client > > + calls :c:func:`VIDIOC_ENUM_FMT` on the ``OUTPUT`` queue. > > + > > + * The driver must always return the full set of supported formats for the > > + currently set ``OUTPUT`` format, irrespective of the format currently set > > + on the ``CAPTURE`` queue. > > + > > +2. To enumerate the set of supported raw formats, the client calls > > + :c:func:`VIDIOC_ENUM_FMT` on the ``CAPTURE`` queue. > > + > > + * The driver must return only the formats supported for the format currently > > + active on the ``OUTPUT`` queue. > > + > > + * In order to enumerate raw formats supported by a given coded format, the > > + client must thus set that coded format on the ``OUTPUT`` queue first, and > > + then enumerate the ``CAPTURE`` queue. > > + > > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported > > + resolutions for a given format, passing desired pixel format in > > + :c:type:`v4l2_frmsizeenum` ``pixel_format``. > > + > > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT`` queue > > + must include all possible coded resolutions supported by the decoder > > + for given coded pixel format. > > + > > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE`` queue > > + must include all possible frame buffer resolutions supported by the > > + decoder for given raw pixel format and coded format currently set on > > + ``OUTPUT`` queue. > > + > > + .. note:: > > + > > + The client may derive the supported resolution range for a > > + combination of coded and raw format by setting width and height of > > + ``OUTPUT`` format to 0 and calculating the intersection of > > + resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES` > > + for the given coded and raw formats. > > + > > +4. Supported profiles and levels for given format, if applicable, may be > > + queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`. > > + > > +Initialization > > +============== > > + > > +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See > > + capability enumeration. > > + > > +2. Set the coded format on the ``OUTPUT`` queue via :c:func:`VIDIOC_S_FMT` > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + ``pixelformat`` > > + a coded pixel format > > + > > + ``width``, ``height`` > > + parsed width and height of the coded format > > + > > + other fields > > + follow standard semantics > > + > > + .. note:: > > + > > + Changing ``OUTPUT`` format may change currently set ``CAPTURE`` > > + format. The driver will derive a new ``CAPTURE`` format from > > + ``OUTPUT`` format being set, including resolution, colorimetry > > + parameters, etc. If the client needs a specific ``CAPTURE`` format, > > + it must adjust it afterwards. > > + > > +3. *[optional]* Get minimum number of buffers required for ``OUTPUT`` > > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to > > + use more buffers than minimum required by hardware/format. > > + > > + * **Required fields:** > > + > > + ``id`` > > + set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT`` > > + > > + * **Return fields:** > > + > > + ``value`` > > + required number of ``OUTPUT`` buffers for the currently set > > + format > > + > > +4. Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the > > + destination buffers parsed/decoded from the bitstream. > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + * **Return fields:** > > + > > + ``width``, ``height`` > > + frame buffer resolution for the decoded frames > > + > > + ``pixelformat`` > > + pixel format for decoded frames > > + > > + ``num_planes`` (for _MPLANE ``type`` only) > > + number of planes for pixelformat > > + > > + ``sizeimage``, ``bytesperline`` > > + as per standard semantics; matching frame buffer format > > + > > + .. note:: > > + > > + The value of ``pixelformat`` may be any pixel format supported for the > > + ``OUTPUT`` format, based on the hardware capabilities. It is suggested > > + that driver chooses the preferred/optimal format for given configuration. > > + For example, a YUV format may be preferred over an RGB format, if > > + additional conversion step would be required. > > + > > +5. *[optional]* Enumerate ``CAPTURE`` formats via :c:func:`VIDIOC_ENUM_FMT` on > > + ``CAPTURE`` queue. The client may use this ioctl to discover which > > + alternative raw formats are supported for the current ``OUTPUT`` format and > > + select one of them via :c:func:`VIDIOC_S_FMT`. > > + > > + .. note:: > > + > > + The driver will return only formats supported for the currently selected > > + ``OUTPUT`` format, even if more formats may be supported by the driver in > > + general. > > + > > + For example, a driver/hardware may support YUV and RGB formats for > > + resolutions 1920x1088 and lower, but only YUV for higher resolutions (due > > + to hardware limitations). After setting a resolution of 1920x1088 or lower > > + as the ``OUTPUT`` format, :c:func:`VIDIOC_ENUM_FMT` may return a set of > > + YUV and RGB pixel formats, but after setting a resolution higher than > > + 1920x1088, the driver will not return RGB, unsupported for this > > + resolution. > > + > > +6. *[optional]* Choose a different ``CAPTURE`` format than suggested via > > + :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the client to > > + choose a different format than selected/suggested by the driver in > > + :c:func:`VIDIOC_G_FMT`. > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``pixelformat`` > > + a raw pixel format > > + > > + .. note:: > > + > > + Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available > > + formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to find > > + out a set of allowed formats for given configuration, but not required, > > + if the client can accept the defaults. > > + > > +7. *[optional]* Get minimum number of buffers required for ``CAPTURE`` queue via > > + :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to use more buffers > > + than minimum required by hardware/format. > > + > > + * **Required fields:** > > + > > + ``id`` > > + set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE`` > > + > > + * **Return fields:** > > + > > + ``value`` > > + minimum number of buffers required to decode the stream parsed in this > > + initialization sequence. > > + > > + .. note:: > > + > > + Note that the minimum number of buffers must be at least the number > > + required to successfully decode the current stream. This may for example > > + be the required DPB size for an H.264 stream given the parsed stream > > + configuration (resolution, level). > > + > > +8. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on > > + ``OUTPUT`` queue. > > + > > + * **Required fields:** > > + > > + ``count`` > > + requested number of buffers to allocate; greater than zero > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + ``memory`` > > + follows standard semantics > > + > > + ``sizeimage`` > > + follows standard semantics; the client is free to choose any > > + suitable size, however, it may be subject to change by the > > + driver > > + > > + * **Return fields:** > > + > > + ``count`` > > + actual number of buffers allocated > > + > > + * The driver must adjust count to minimum of required number of ``OUTPUT`` > > + buffers for given format and count passed. The client must check this > > + value after the ioctl returns to get the number of buffers allocated. > > + > > + .. note:: > > + > > + To allocate more than minimum number of buffers (for pipeline depth), use > > + G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to get minimum number of > > + buffers required by the driver/format, and pass the obtained value plus > > + the number of additional buffers needed in count to > > + :c:func:`VIDIOC_REQBUFS`. > > + > > +9. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` on the > > + ``CAPTURE`` queue. > > + > > + * **Required fields:** > > + > > + ``count`` > > + requested number of buffers to allocate; greater than zero > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``memory`` > > + follows standard semantics > > + > > + * **Return fields:** > > + > > + ``count`` > > + adjusted to allocated number of buffers > > + > > + * The driver must adjust count to minimum of required number of > > + destination buffers for given format and stream configuration and the > > + count passed. The client must check this value after the ioctl > > + returns to get the number of buffers allocated. > > + > > + .. note:: > > + > > + To allocate more than minimum number of buffers (for pipeline > > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to > > + get minimum number of buffers required, and pass the obtained value > > + plus the number of additional buffers needed in count to > > + :c:func:`VIDIOC_REQBUFS`. > > + > > +10. Allocate requests (likely one per ``OUTPUT`` buffer) via > > + :c:func:`MEDIA_IOC_REQUEST_ALLOC` on the media device. > > + > > +11. Start streaming on both ``OUTPUT`` and ``CAPTURE`` queues via > > + :c:func:`VIDIOC_STREAMON`. > > If I am not mistaken, steps 1-11 are the same as for stateful codecs. It is better > to just refer to that instead of copying them. They are similar but not exactly identical, particularly after Tomasz' comments. It may be difficult to avoid some duplication here. > > > + > > +Decoding > > +======== > > + > > +For each frame, the client is responsible for submitting a request to which the > > +following is attached: > > + > > +* Exactly one frame worth of encoded data in a buffer submitted to the > > + ``OUTPUT`` queue, > > +* All the controls relevant to the format being decoded (see below for details). > > + > > +``CAPTURE`` buffers must not be part of the request, but must be queued > > +independently. The driver will pick one of the queued ``CAPTURE`` buffers and > > +decode the frame into it. Although the client has no control over which > > +``CAPTURE`` buffer will be used with a given ``OUTPUT`` buffer, it is guaranteed > > +that ``CAPTURE`` buffers will be returned in decode order (i.e. the same order > > +as ``OUTPUT`` buffers were submitted), so it is trivial to associate a dequeued > > +``CAPTURE`` buffer to its originating request and ``OUTPUT`` buffer. > > + > > +If the request is submitted without an ``OUTPUT`` buffer or if one of the > > +required controls are missing, then :c:func:`MEDIA_REQUEST_IOC_QUEUE` will return > > +``-EINVAL``. > > Not entirely true: if buffers are missing, then ENOENT is returned. Missing required > controls or more than one OUTPUT buffer will result in EINVAL. This per the latest > Request API changes. Fixed that, thanks. > > Decoding errors are signaled by the ``CAPTURE`` buffers being > > +dequeued carrying the ``V4L2_BUF_FLAG_ERROR`` flag. > > Add here that if the reference frame had an error, then all other frames that refer > to it should also set the ERROR flag. It is up to userspace to decide whether or > not to drop them (part of the frame might still be valid). Noted and added. > > I am not sure whether this should be documented, but there are some additional > restrictions w.r.t. reference frames: > > Since decoders need access to the decoded reference frames there are some corner > cases that need to be checked: > > 1) V4L2_MEMORY_USERPTR cannot be used for the capture queue: the driver does not > know when a malloced but dequeued buffer is freed, so the reference frame > could suddenly be gone. It it is confirmed that we cannot use USERPTR buffers as CAPTURE then this probably needs to be documented. I wonder if there isn't a way to avoid this by having vb2 keep a reference to the pages in such a way that they would not be recycled after after userspace calls free() on the buffer. Is that possible with user-allocated memory? Not that I think that forbidding USERPTR buffers in this scenario would be a big problem. > > 2) V4L2_MEMORY_DMABUF can be used, but drivers should check that the dma buffer is > still available AND increase the dmabuf refcount while it is used by the HW. Yeah, with DMABUF the above problem can easily be avoided at least. > > 3) What to do if userspace has requeued a buffer containing a reference frame, > and you want to decode a B/P-frame that refers to that buffer? We need to > check against that: I think that when you queue a capture buffer whose index > is used in a pending request as a reference frame, than that should fail with > an error. And trying to queue a request referring to a buffer that has been > requeued should also fail. > > We might need to add some support for this in v4l2-mem2mem.c or vb2. Sounds good, and we should document this as well.