On 01/23/2019 06:27 AM, Tomasz Figa wrote: > On Tue, Jan 22, 2019 at 11:47 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote: >> >> On 01/22/19 11:02, Tomasz Figa wrote: >>> On Mon, Nov 12, 2018 at 8:37 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote: >>>> >>>> Hi Tomasz, >>>> >>>> A general note for the stateful and stateless patches: they describe specific >>>> use-cases of the more generic Codec Interface, and as such should be one >>>> level deeper in the section hierarchy. >>> >>> I wonder what exactly this Codec Interface is. Is it a historical name >>> for mem2mem? If so, perhaps it would make sense to rename it? >> >> Yeah, it should be renamed to "Video Memory-to-Memory Interface", and the >> codecs are just specific instances of such an interface. >> > > Ack. > >>> >>>> >>>> I.e. instead of being section 4.6/7/8: >>>> >>>> https://hverkuil.home.xs4all.nl/request-api/uapi/v4l/devices.html >>>> >>>> they should be 4.5.1/2/3. >>>> >>> >>> FYI, the first RFC started like that, but it only made the spec >>> difficult to navigate and the section numbers too long. >>> >>> Still, no strong opinion. I'm okay moving it there, if you think it's better. >> >> It should be moved and the interface name should be renamed. It makes a lot >> more sense with those changes. >> >> I've posted a patch for this. >> > > Thanks. I've rebased on top of it. > > [snip] >>>>> +3. Query the minimum number of buffers required for the ``CAPTURE`` queue via >>>>> + :c:func:`VIDIOC_G_CTRL`. This is useful if the client intends to use more >>>>> + buffers than the minimum required by hardware/format. >>>> >>>> Is this step optional or required? Can it change when a resolution change occurs? >>> >>> Probably not with a simple resolution change, but a case when a stream >>> is changed on the fly would trigger what we call "resolution change" >>> here, but what would effectively be a "source change" and that could >>> include a change in the number of required CAPTURE buffers. >>> >>>> How does this relate to the checks for the minimum number of buffers that REQBUFS >>>> does? >>> >>> The control returns the minimum that REQBUFS would allow, so the >>> application can add few more buffers on top of that and improve the >>> pipelining. >>> >>>> >>>> The 'This is useful if' sentence suggests that it is optional, but I think that >>>> sentence just confuses the issue. >>>> >>> >>> It used to be optional and I didn't rephrase it after turning it into >>> mandatory. How about: >>> >>> This enables the client to request more buffers >>> than the minimum required by hardware/format and achieve better pipelining. >> >> Hmm, OK. It'll do, I guess. I never liked these MIN_BUFFERS controls, I wish they >> would return something like the recommended number of buffers that will give you >> decent performance. >> > > The problem here is that the kernel doesn't know what is decent for > the application, since it doesn't know how the results of the decoding > are used. Over-allocating would result to a waste of memory, which > could then make it less than decent for memory-constrained > applications. > >>> >>>>> + >>>>> + * **Required fields:** >>>>> + >>>>> + ``id`` >>>>> + set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE`` >>>>> + >>>>> + * **Return fields:** >>>>> + >>>>> + ``value`` >>>>> + minimum number of buffers required to decode the stream parsed in >>>>> + this initialization sequence. >>>>> + >>>>> + .. note:: >>>>> + >>>>> + The minimum number of buffers must be at least the number required to >>>>> + successfully decode the current stream. This may for example be the >>>>> + required DPB size for an H.264 stream given the parsed stream >>>>> + configuration (resolution, level). >>>>> + >>>>> + .. warning:: >>>>> + >>>>> + The value is guaranteed to be meaningful only after the decoder >>>>> + successfully parses the stream metadata. The client must not rely on the >>>>> + query before that happens. >>>>> + >>>>> +4. **Optional.** Enumerate ``CAPTURE`` formats via :c:func:`VIDIOC_ENUM_FMT` on >>>>> + the ``CAPTURE`` queue. Once the stream information is parsed and known, the >>>>> + client may use this ioctl to discover which raw formats are supported for >>>>> + given stream and select one of them via :c:func:`VIDIOC_S_FMT`. >>>> >>>> Can the list returned here differ from the list returned in the 'Querying capabilities' >>>> step? If so, then I assume it will always be a subset of what was returned in >>>> the 'Querying' step? >>> >>> Depends on whether you're considering just VIDIOC_ENUM_FMT or also >>> VIDIOC_G_FMT and VIDIOC_ENUM_FRAMESIZES. >>> >>> The initial VIDIOC_ENUM_FMT has no way to account for any resolution >>> constraints of the formats, so the list would include all raw pixel >>> formats that the decoder can handle with selected coded pixel format. >>> However, the list can be further narrowed down by using >>> VIDIOC_ENUM_FRAMESIZES, to restrict each raw format only to the >>> resolutions it can handle. >>> >>> The VIDIOC_ENUM_FMT call in this sequence (after getting the stream >>> information) would have the knowledge about the resolution, so the >>> list returned here would only include the formats that can be actually >>> handled. It should match the result of the initial query using both >>> VIDIOC_ENUM_FMT and VIDIOC_ENUM_FRAMESIZES. >> >> Right, so this will be a subset of the initial query taking the resolution >> into account. >> > > Do you think it could be worth adding a note about this? Yes, it helps clarify things. > > [snip] >>>>> +Decoding >>>>> +======== >>>>> + >>>>> +This state is reached after the `Capture setup` sequence finishes succesfully. >>>>> +In this state, the client queues and dequeues buffers to both queues via >>>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following the standard >>>>> +semantics. >>>>> + >>>>> +The contents of the source ``OUTPUT`` buffers depend on the active coded pixel >>>>> +format and may be affected by codec-specific extended controls, as stated in >>>>> +the documentation of each format. >>>>> + >>>>> +Both queues operate independently, following the standard behavior of V4L2 >>>>> +buffer queues and memory-to-memory devices. In addition, the order of decoded >>>>> +frames dequeued from the ``CAPTURE`` queue may differ from the order of queuing >>>>> +coded frames to the ``OUTPUT`` queue, due to properties of the selected coded >>>>> +format, e.g. frame reordering. >>>>> + >>>>> +The client must not assume any direct relationship between ``CAPTURE`` >>>>> +and ``OUTPUT`` buffers and any specific timing of buffers becoming >>>>> +available to dequeue. Specifically, >>>>> + >>>>> +* a buffer queued to ``OUTPUT`` may result in no buffers being produced >>>>> + on ``CAPTURE`` (e.g. if it does not contain encoded data, or if only >>>>> + metadata syntax structures are present in it), >>>>> + >>>>> +* a buffer queued to ``OUTPUT`` may result in more than 1 buffer produced >>>>> + on ``CAPTURE`` (if the encoded data contained more than one frame, or if >>>>> + returning a decoded frame allowed the decoder to return a frame that >>>>> + preceded it in decode, but succeeded it in the display order), >>>>> + >>>>> +* a buffer queued to ``OUTPUT`` may result in a buffer being produced on >>>>> + ``CAPTURE`` later into decode process, and/or after processing further >>>>> + ``OUTPUT`` buffers, or be returned out of order, e.g. if display >>>>> + reordering is used, >>>>> + >>>>> +* buffers may become available on the ``CAPTURE`` queue without additional >>>>> + buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of the >>>>> + ``OUTPUT`` buffers queued in the past whose decoding results are only >>>>> + available at later time, due to specifics of the decoding process. >>>>> + >>>>> +.. note:: >>>>> + >>>>> + To allow matching decoded ``CAPTURE`` buffers with ``OUTPUT`` buffers they >>>>> + originated from, the client can set the ``timestamp`` field of the >>>>> + :c:type:`v4l2_buffer` struct when queuing an ``OUTPUT`` buffer. The >>>>> + ``CAPTURE`` buffer(s), which resulted from decoding that ``OUTPUT`` buffer >>>>> + will have their ``timestamp`` field set to the same value when dequeued. >>>>> + >>>>> + In addition to the straighforward case of one ``OUTPUT`` buffer producing >> >> straighforward -> straightforward >> > > Ack. > >>>>> + one ``CAPTURE`` buffer, the following cases are defined: >>>>> + >>>>> + * one ``OUTPUT`` buffer generates multiple ``CAPTURE`` buffers: the same >>>>> + ``OUTPUT`` timestamp will be copied to multiple ``CAPTURE`` buffers, >>>>> + >>>>> + * multiple ``OUTPUT`` buffers generate one ``CAPTURE`` buffer: timestamp of >>>>> + the ``OUTPUT`` buffer queued last will be copied, >>>>> + >>>>> + * the decoding order differs from the display order (i.e. the >>>>> + ``CAPTURE`` buffers are out-of-order compared to the ``OUTPUT`` buffers): >>>>> + ``CAPTURE`` timestamps will not retain the order of ``OUTPUT`` timestamps >>>>> + and thus monotonicity of the timestamps cannot be guaranteed. >> >> I think this last point should be rewritten. The timestamp is just a value that >> is copied, there are no monotonicity requirements for m2m devices in general. >> > > Actually I just realized the last point might not even be achievable > for some of the decoders (s5p-mfc, mtk-vcodec), as they don't report > which frame originates from which bitstream buffer and the driver just > picks the most recently consumed OUTPUT buffer to copy the timestamp > from. (s5p-mfc actually "forgets" to set the timestamp in some cases > too...) > > I need to think a bit more about this. > > [snip] >>>>> +Seek >>>>> +==== >>>>> + >>>>> +Seek is controlled by the ``OUTPUT`` queue, as it is the source of coded data. >>>>> +The seek does not require any specific operation on the ``CAPTURE`` queue, but >>>>> +it may be affected as per normal decoder operation. >>>>> + >>>>> +1. Stop the ``OUTPUT`` queue to begin the seek sequence via >>>>> + :c:func:`VIDIOC_STREAMOFF`. >>>>> + >>>>> + * **Required fields:** >>>>> + >>>>> + ``type`` >>>>> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` >>>>> + >>>>> + * The decoder will drop all the pending ``OUTPUT`` buffers and they must be >>>>> + treated as returned to the client (following standard semantics). >>>>> + >>>>> +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` >>>>> + >>>>> + * **Required fields:** >>>>> + >>>>> + ``type`` >>>>> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` >>>>> + >>>>> + * The decoder will start accepting new source bitstream buffers after the >>>>> + call returns. >>>>> + >>>>> +3. Start queuing buffers containing coded data after the seek to the ``OUTPUT`` >>>>> + queue until a suitable resume point is found. >>>>> + >>>>> + .. note:: >>>>> + >>>>> + There is no requirement to begin queuing coded data starting exactly >>>>> + from a resume point (e.g. SPS or a keyframe). Any queued ``OUTPUT`` >>>>> + buffers will be processed and returned to the client until a suitable >>>>> + resume point is found. While looking for a resume point, the decoder >>>>> + should not produce any decoded frames into ``CAPTURE`` buffers. >>>>> + >>>>> + Some hardware is known to mishandle seeks to a non-resume point. Such an >>>>> + operation may result in an unspecified number of corrupted decoded frames >>>>> + being made available on the ``CAPTURE`` queue. Drivers must ensure that >>>>> + no fatal decoding errors or crashes occur, and implement any necessary >>>>> + handling and workarounds for hardware issues related to seek operations. >>>> >>>> Is there a requirement that those corrupted frames have V4L2_BUF_FLAG_ERROR set? >>>> I.e., can userspace detect those currupted frames? >>>> >>> >>> I think the question is whether the kernel driver can actually detect >>> those corrupted frames. We can't guarantee reporting errors to the >>> userspace, if the hardware doesn't actually report them. >>> >>> Could we perhaps keep this an open question and possibly address with >>> some extension that could be an opt in for the decoders that can >>> report errors? >> >> Hmm, how about: If the hardware can detect such corrupted decoded frames, then >> it shall set V4L2_BUF_FLAG_ERROR. >> > > Sounds good to me. > > Actually, let me add a paragraph about error handling in the decoding > section, since it's a general problem, not limited to seeking. I can > then refer to it from the Seek sequence. > >>> >>>>> + >>>>> + .. warning:: >>>>> + >>>>> + In case of the H.264 codec, the client must take care not to seek over a >>>>> + change of SPS/PPS. Even though the target frame could be a keyframe, the >>>>> + stale SPS/PPS inside decoder state would lead to undefined results when >>>>> + decoding. Although the decoder must handle such case without a crash or a >>>>> + fatal decode error, the client must not expect a sensible decode output. >>>>> + >>>>> +4. After a resume point is found, the decoder will start returning ``CAPTURE`` >>>>> + buffers containing decoded frames. >>>>> + >>>>> +.. important:: >>>>> + >>>>> + A seek may result in the `Dynamic resolution change` sequence being >>>>> + iniitated, due to the seek target having decoding parameters different from >>>>> + the part of the stream decoded before the seek. The sequence must be handled >>>>> + as per normal decoder operation. >>>>> + >>>>> +.. warning:: >>>>> + >>>>> + It is not specified when the ``CAPTURE`` queue starts producing buffers >>>>> + containing decoded data from the ``OUTPUT`` buffers queued after the seek, >>>>> + as it operates independently from the ``OUTPUT`` queue. >>>>> + >>>>> + The decoder may return a number of remaining ``CAPTURE`` buffers containing >>>>> + decoded frames originating from the ``OUTPUT`` buffers queued before the >>>>> + seek sequence is performed. >>>>> + >>>>> + The ``VIDIOC_STREAMOFF`` operation discards any remaining queued >>>>> + ``OUTPUT`` buffers, which means that not all of the ``OUTPUT`` buffers >>>>> + queued before the seek sequence may have matching ``CAPTURE`` buffers >>>>> + produced. For example, given the sequence of operations on the >>>>> + ``OUTPUT`` queue: >>>>> + >>>>> + QBUF(A), QBUF(B), STREAMOFF(), STREAMON(), QBUF(G), QBUF(H), >>>>> + >>>>> + any of the following results on the ``CAPTURE`` queue is allowed: >>>>> + >>>>> + {A’, B’, G’, H’}, {A’, G’, H’}, {G’, H’}. >>>> >>>> Isn't it the case that if you would want to avoid that, then you should call >>>> DECODER_STOP, wait for the last buffer on the CAPTURE queue, then seek and >>>> call DECODER_START. If you do that, then you should always get {A’, B’, G’, H’}. >>>> (basically following the Drain sequence). >>> >>> Yes, it is, but I think it depends on the application needs. Here we >>> just give a primitive to change the place in the stream that's being >>> decoded (or change the stream on the fly). >>> >>> Actually, with the timestamp copy, I guess we wouldn't even need to do >>> the DECODER_STOP, as we could just discard the CAPTURE buffers until >>> we get one that matches the timestamp of the first OUTPUT buffer after >>> the seek. >>> >>>> >>>> Admittedly, you typically want to do an instantaneous seek, so this is probably >>>> not what you want to do normally. >>>> >>>> It might help to have this documented in a separate note. >>>> >>> >>> The instantaneous seek is documented below. I'm not sure if there is >>> any practical need to document the other case, but I could add a >>> sentence like below to the warning above. What do you think? >>> >>> The ``VIDIOC_STREAMOFF`` operation discards any remaining queued >>> ``OUTPUT`` buffers, which means that not all of the ``OUTPUT`` buffers >>> queued before the seek sequence may have matching ``CAPTURE`` buffers >>> produced. For example, given the sequence of operations on the >>> ``OUTPUT`` queue: >>> >>> QBUF(A), QBUF(B), STREAMOFF(), STREAMON(), QBUF(G), QBUF(H), >>> >>> any of the following results on the ``CAPTURE`` queue is allowed: >> >> is allowed -> are allowed >> > > Only one can happen at given time, so I think singular is correct > here? (i.e. Any [...] is allowed) I think you are right. > > [snip] >>>>> diff --git a/Documentation/media/uapi/v4l/vidioc-g-fmt.rst b/Documentation/media/uapi/v4l/vidioc-g-fmt.rst >>>>> index 3ead350e099f..0fc0b78a943e 100644 >>>>> --- a/Documentation/media/uapi/v4l/vidioc-g-fmt.rst >>>>> +++ b/Documentation/media/uapi/v4l/vidioc-g-fmt.rst >>>>> @@ -53,6 +53,13 @@ devices that is either the struct >>>>> member. When the requested buffer type is not supported drivers return >>>>> an ``EINVAL`` error code. >>>>> >>>>> +A stateful mem2mem decoder will not allow operations on the >>>>> +``V4L2_BUF_TYPE_VIDEO_CAPTURE`` or ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE`` >>>>> +buffer type until the corresponding ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or >>>>> +``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE`` buffer type is configured. If such an >>>>> +operation is attempted, drivers return an ``EACCES`` error code. Refer to >>>>> +:ref:`decoder` for more details. >>>> >>>> This isn't right. EACCES is returned as long as the output format resolution is >>>> unknown. If it is set explicitly, then this will work without an error. >> >> Ah, sorry, I phrased that poorly. Let me try again: >> >> This isn't right. EACCES is returned for CAPTURE operations as long as the >> output format resolution is unknown or the CAPTURE format is explicitly set. >> If the CAPTURE format is set explicitly, then this will work without an error. > > We don't allow directly setting CAPTURE format explicitly either, > because the driver wouldn't have anything to validate the format > against. We allow the client to do it indirectly, by setting the width > and height of the OUTPUT format, which unblocks the CAPTURE format > operations, because the driver can then validate against the OUTPUT > (coded) format. You are completely right, just forget what I said. Regards, Hans > > Best regards, > Tomasz >