Re: [RFC] Stateful codecs and requirements for compressed formats

Dave Stevenson <dave.stevenson@xxxxxxxxxxxxxxx> · Fri, 28 Jun 2019 16:21:54 +0100

Hi Hans

On Fri, 28 Jun 2019 at 15:34, Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>
> Hi all,
>
> I hope I Cc-ed everyone with a stake in this issue.
>
> One recurring question is how a stateful encoder fills buffers and how a stateful
> decoder consumes buffers.
>
> The most generic case is that an encoder produces a bitstream and just fills each
> CAPTURE buffer to the brim before continuing with the next buffer.
>
> I don't think there are drivers that do this, I believe that all drivers just
> output a single compressed frame. For interlaced formats I understand it is either
> one compressed field per buffer, or two compressed fields per buffer (this is
> what I heard, I don't know if this is true).

From the discussion that started this thread: with H264 and similar,
does the V4L2 buffer contain just the frame data, or the SPS/PPS
headers as well?
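
To make the question concrete, here is a minimal sketch of how userspace
could check what a dequeued H264 buffer actually holds. It assumes the
buffer carries an Annex B byte stream (NAL units preceded by 00 00 01
start codes); the function name and the HAS_* flags are made up for
illustration and are not part of any V4L2 API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags describing what was found in one buffer. */
enum {
    HAS_SPS   = 1 << 0,  /* nal_unit_type 7 */
    HAS_PPS   = 1 << 1,  /* nal_unit_type 8 */
    HAS_SLICE = 1 << 2,  /* nal_unit_type 1 (non-IDR) or 5 (IDR) */
};

/*
 * Scan an Annex B H264 buffer for start codes and report which kinds
 * of NAL unit it contains. A 4-byte start code (00 00 00 01) is
 * matched through its trailing three bytes.
 */
static unsigned int h264_buffer_contents(const uint8_t *buf, size_t len)
{
    unsigned int found = 0;
    size_t i;

    for (i = 0; i + 3 < len; i++) {
        if (buf[i] != 0 || buf[i + 1] != 0 || buf[i + 2] != 1)
            continue;

        /* The low 5 bits of the byte after the start code are the type. */
        switch (buf[i + 3] & 0x1f) {
        case 1:
        case 5:
            found |= HAS_SLICE;
            break;
        case 7:
            found |= HAS_SPS;
            break;
        case 8:
            found |= HAS_PPS;
            break;
        default:
            break;
        }
        i += 2;  /* skip the rest of the start code */
    }
    return found;
}
```

A buffer that returns only HAS_SPS | HAS_PPS is a headers-only buffer;
one that returns all three flags has the headers inline with the frame.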

> In any case, I don't think this is specified anywhere. Please correct me if I am
> wrong.
>
> The latest stateful codec spec is here:
>
> https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-mem2mem.html
>
> Assuming what I described above is indeed the case, then I think this should
> be documented. I don't know enough if a flag is needed somewhere to describe
> the behavior for interlaced formats, or can we leave this open and have userspace
> detect this?
>
>
> For decoders it is more complicated. The stateful decoder spec is written with
> the assumption that userspace can just fill each OUTPUT buffer to the brim with
> the compressed bitstream. I.e., no need to split at frame or other boundaries.
>
> See section 4.5.1.7 in the spec.
>
> But I understand that various HW decoders *do* have limitations. I would really
> like to know about those, since that needs to be exposed to userspace somehow.
>
> Specifically, the venus decoder needs to know the resolution of the coded video
> beforehand and it expects a single frame per buffer (how does that work for
> interlaced formats?).
>
> Such requirements mean that some userspace parsing is still required, so these
> decoders are not completely stateful.
>
> Can every codec author give information about their decoder/encoder?
>
> I'll start off with my virtual codec driver:
>
> vicodec: the decoder fully parses the bitstream. The encoder produces a single
> compressed frame per buffer. This driver doesn't yet support interlaced formats,
> but when that is added it will encode one field per buffer.

On BCM283x:

The underlying decoder will accept anything, but giving it a single
frame per buffer reduces latency as the bitstream parser gets kicked
earlier. Based on previous discussions I am setting the flag so that
it expects one compressed frame per buffer, but I don't believe it
goes wrong should that not be the case (it'll just waste a bit of
processing effort).
It'll parse the headers and produce a V4L2_EVENT_SOURCE_CHANGE event
should the capture queue format not match the stream parameters.
Interlacing isn't supported yet (it's on the list), but I believe the
hardware produces the equivalent of V4L2_FIELD_INTERLACED_[TB|BT].

The encoder currently spits out the H264 SPS/PPS headers as a separate
V4L2 buffer, and then one compressed frame per V4L2 buffer (provided
the buffer is big enough). Should
V4L2_CID_MPEG_VIDEO_REPEAT_SEQ_HEADER be set, then it will repeat the
headers in an independent V4L2 buffer before each I frame.
I'm quite happy to amend this should we have a decent spec of what is
required. As I've never found a spec, it's been trial and error until
now.
There is no interlaced support available.

  Dave