Re: [RFC] Stateful codecs and requirements for compressed formats

Hans Verkuil <hverkuil@xxxxxxxxx> · Wed, 10 Jul 2019 11:14:17 +0200

On 7/3/19 10:32 AM, Tomasz Figa wrote:
> Hi Hans,
> 
> On Fri, Jun 28, 2019 at 11:34 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>
>> Hi all,
>>
>> I hope I Cc-ed everyone with a stake in this issue.
>>
>> One recurring question is how a stateful encoder fills buffers and how a stateful
>> decoder consumes buffers.
>>
>> The most generic case is that an encoder produces a bitstream and just fills each
>> CAPTURE buffer to the brim before continuing with the next buffer.
>>
>> I don't think there are drivers that do this; I believe that all drivers just
>> output a single compressed frame per buffer. For interlaced formats I understand
>> it is either one compressed field per buffer or two compressed fields per buffer
>> (this is what I heard, I don't know if this is true).
>>
>> In any case, I don't think this is specified anywhere. Please correct me if I am
>> wrong.
>>
>> The latest stateful codec spec is here:
>>
>> https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-mem2mem.html
>>
>> Assuming what I described above is indeed the case, I think this should be
>> documented. I don't know enough to say whether a flag is needed somewhere to
>> describe the behavior for interlaced formats, or whether we can leave this open
>> and have userspace detect it.
>>
> 
> From the Chromium perspective, we don't have any use case for encoding
> interlaced content, so we'll be okay with whatever the interested
> parties decide on. :)
> 
>>
>> For decoders it is more complicated. The stateful decoder spec is written with
>> the assumption that userspace can just fill each OUTPUT buffer to the brim with
>> the compressed bitstream. I.e., no need to split at frame or other boundaries.
>>
>> See section 4.5.1.7 in the spec.
>>
>> But I understand that various HW decoders *do* have limitations. I would really
>> like to know about those, since that needs to be exposed to userspace somehow.
> 
> AFAIK mtk-vcodec needs H.264 SPS and PPS to be split into their own
> separate buffers. I believe it also needs 1 buffer to contain exactly
> 1 frame and 1 frame to be fully contained inside 1 buffer.
> 
> Venus also needed 1 buffer to contain exactly 1 frame and 1 frame to
> be fully contained inside 1 buffer. It used to have some specific
> requirements regarding SPS and PPS too, but I think that was fixed in
> the firmware.
> 
>>
>> Specifically, the venus decoder needs to know the resolution of the coded video
>> beforehand
> 
> I don't think that's true for venus. It does parsing and can detect
> the resolution.
> 
> However that's probably the case for coda...
> 
>> and it expects a single frame per buffer (how does that work for
>> interlaced formats?).
>>
>> Such requirements mean that some userspace parsing is still required, so these
>> decoders are not completely stateful.
>>
>> Can every codec author give information about their decoder/encoder?
>>
>> I'll start off with my virtual codec driver:
>>
>> vicodec: the decoder fully parses the bitstream. The encoder produces a single
>> compressed frame per buffer. This driver doesn't yet support interlaced formats,
>> but when that is added it will encode one field per buffer.
>>
>> Let's see what the results are.
> 
> s5p-mfc:
>  decoder: fully parses the bitstream,
>  encoder: produces a single frame per buffer (haven't tested interlaced stuff)
> 
> mtk-vcodec:
>  decoder: expects separate buffers for SPS, PPS and full frames
> (including some random stuff like SEIMessage),

Do you mean that the SPS/PPS etc. should all be in separate buffers? I.e.
you can't combine SPS and PPS in a single buffer?
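
As an illustration of what that would imply for userspace, here is a rough
sketch of splitting an H.264 Annex B byte stream at start codes so that
parameter sets can be queued on their own. It is not taken from any driver or
application: struct nal_span, split_annexb() and needs_own_buffer() are
made-up names, and the "SPS and PPS each get their own OUTPUT buffer" rule is
only my reading of the description above, not something confirmed for
mtk-vcodec.

```c
#include <stddef.h>
#include <stdint.h>

/* H.264 nal_unit_type values (low 5 bits of the NAL header byte). */
enum { NAL_SLICE = 1, NAL_IDR = 5, NAL_SEI = 6, NAL_SPS = 7, NAL_PPS = 8 };

/* Hypothetical helper type: one NAL unit including its start code. */
struct nal_span {
	size_t offset;	/* offset of the start code in the input */
	size_t size;	/* start code plus NAL unit, in bytes */
	uint8_t type;	/* nal_unit_type */
};

/* Return the offset of the next 00 00 01 at or after pos, or len if none. */
static size_t find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
	for (; pos + 3 <= len; pos++)
		if (buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
			return pos;
	return len;
}

/* Split buf into at most max NAL spans; returns the number of spans found. */
static size_t split_annexb(const uint8_t *buf, size_t len,
			   struct nal_span *out, size_t max)
{
	size_t n = 0;
	size_t sc = find_start_code(buf, len, 0);

	while (sc < len && n < max) {
		size_t hdr = sc + 3;	/* NAL header byte follows 00 00 01 */
		size_t next, begin = sc, end;

		if (hdr >= len)
			break;
		next = find_start_code(buf, len, hdr);
		end = next;

		/* Count a single zero byte before 00 00 01 as part of the
		 * start code, so the 4-byte form stays with its NAL unit. */
		if (begin > 0 && buf[begin - 1] == 0)
			begin--;
		if (next < len && next > hdr + 1 && buf[next - 1] == 0)
			end--;

		out[n].offset = begin;
		out[n].size = end - begin;
		out[n].type = buf[hdr] & 0x1f;
		n++;
		sc = next;
	}
	return n;
}

/* Assumed per-driver rule: parameter sets must not share a buffer. */
static int needs_own_buffer(uint8_t type)
{
	return type == NAL_SPS || type == NAL_PPS;
}
```

Userspace would then copy each span for which needs_own_buffer() is true into
its own OUTPUT buffer, set bytesused to the span size and queue it with
VIDIOC_QBUF, before doing the same with the NAL units that make up one frame.
If SPS and PPS may share a buffer, the first two spans could simply be copied
back to back instead.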

Regards,

	Hans

>  encoder: produces a single frame per buffer (haven't tested interlaced stuff)
> 
> Best regards,
> Tomasz
>