Re: Stateless Encoding uAPI Discussion and Proposal

Andrzej Pietrasiewicz <andrzej.p@xxxxxxxxxxxxx> · Wed, 9 Aug 2023 19:24:03 +0200

Hi Paul & Hans,

On 9.08.2023 at 16:43, Paul Kocialkowski wrote:
Hi Hans,

On Wed 26 Jul 23, 10:18, Hans Verkuil wrote:
On 11/07/2023 20:18, Nicolas Dufresne wrote:
On Tuesday 11 July 2023 at 19:12 +0200, Paul Kocialkowski wrote:
Hi everyone!

After various discussions following Andrzej's talk at EOSS, feedback from the
Media Summit (which I could not attend unfortunately) and various direct
discussions, I have compiled some thoughts and ideas about stateless encoders
support with various proposals. This is the result of a few years of interest
in the topic, after working on a PoC for the Hantro H1 using the hantro driver,
which turned out to have numerous design issues.

I am now working on an H.264 encoder driver for Allwinner platforms (currently
focusing on the V3/V3s), which already provides some usable bitstream and will
be published soon.

This is a very long email where I've tried to split things into distinct topics
and explain a few concepts to make sure everyone is on the same page.

# Bitstream Headers

Stateless encoders typically do not generate all the bitstream headers and
sometimes no header at all (e.g. Allwinner encoder does not even produce slice
headers). There's often some hardware block that makes bit-level writing to the
destination buffer easier (deals with alignment, etc).

The values of the bitstream headers must be in line with how the compressed
data bitstream is generated and generally follow the codec specification.
Some encoders might allow configuring all the fields found in the headers,
others may only allow configuring a few or have specific constraints regarding
which values are allowed.

As a result, we cannot expect that any given encoder is able to produce frames
for any set of headers. Reporting related constraints and limitations (beyond
profile/level) seems quite difficult and error-prone.

So it seems that keeping header generation in-kernel only (close to where the
hardware is actually configured) is the safest approach.

This seems to match what happened with the Hantro VP8 proof of concept. The
encoder does not produce the frame header, but also, it produces 2 encoded
buffers which cannot be made contiguous at the hardware level. This notion of
plane in coded data wasn't something that blended well with the rest of the API
and we didn't want to copy in the kernel while the userspace would also be
forced to copy to align the headers. Our conclusion was that it was best to
generate the headers and copy both segments before delivering to userspace. I
suspect this type of situation will be quite common.
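
To illustrate the "generate the headers and copy both segments" step, here is a
minimal userspace-style sketch in C. It is not the Hantro driver code: the
struct coded_segment type and the assemble_frame() helper are hypothetical
names made up for this example, and a real driver would work on its own buffer
objects rather than plain pointers. It only shows the idea of writing the
header first and then appending each non-contiguous hardware output buffer, so
that userspace receives a single contiguous frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one hardware output buffer. */
struct coded_segment {
	const uint8_t *data;
	size_t len;
};

/*
 * Copy the generated header, then each coded segment, into dst.
 * Returns the total number of bytes written, or 0 if dst is too small.
 */
size_t assemble_frame(uint8_t *dst, size_t dst_size,
		      const uint8_t *hdr, size_t hdr_len,
		      const struct coded_segment *segs, size_t nsegs)
{
	size_t off;
	size_t i;

	assert(dst && hdr && segs);

	if (hdr_len > dst_size)
		return 0;
	memcpy(dst, hdr, hdr_len);
	off = hdr_len;

	for (i = 0; i < nsegs; i++) {
		/* off <= dst_size always holds here, so no underflow. */
		if (segs[i].len > dst_size - off)
			return 0;
		memcpy(dst + off, segs[i].data, segs[i].len);
		off += segs[i].len;
	}

	return off;
}
```

The value returned here is what would end up reported as the payload size of
the single buffer handed to userspace.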

# Codec Features

Codecs have many variable features that can be enabled or not and specific
configuration fields that can take various values. There is usually some
top-level indication of profile/level that restricts what can be used.

This is a very similar situation to stateful encoding, where codec-specific
controls are used to report and set profile/level and configure these aspects.
A particularly nice thing about it is that we can reuse these existing controls
and add new ones in the future for features that are not yet covered.

This approach feels more flexible than designing new structures with a selected
set of parameters (that could match the existing controls) for each codec.

Though, reading more into this email, we still have a fair amount of controls
to design and add, probably some compound controls too?

I expect that for stateless encoders support for read-only requests will be needed:

https://patchwork.linuxtv.org/project/linux-media/list/?series=5647

I worked on that in the past together with dynamic control arrays. The dynamic
array part was merged, but the read-only request part wasn't (there was never a
driver that actually needed it).

I don't know if that series still applies, but if there is a need for it then I
can rebase it and post an RFCv3.

So if I understand this correctly (from a quick look), this would be to allow
stateless encoder drivers to attach a particular control value to a specific
returned frame?

I guess this would be a good match to return statistics about the encoded frame.
However that would probably be expressed in a hardware-specific way so it
seems preferable to not expose this to userspace and handle it in-kernel
instead.

What's really important for userspace to know (in order to do user-side
rate-control, which we definitely want to support) is the resulting bitstream
size. This is already available with bytesused.
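
As a rough illustration of why bytesused is enough for user-side rate control,
here is a deliberately naive sketch in C. Everything in it is an assumption
made for the example: struct rc_state, rc_update(), the 10% dead band and the
one-step QP adjustment are not taken from any existing implementation. The
only inputs are a per-frame bit budget (target bitrate divided by frame rate)
and the bytesused value of the dequeued capture buffer.

```c
#include <assert.h>
#include <stdint.h>

/* H.264 QP range for 8-bit content. */
#define QP_MIN 0
#define QP_MAX 51

/* Hypothetical rate-control state kept by userspace. */
struct rc_state {
	uint32_t target_bits_per_frame;	/* bitrate / frame rate */
	int qp;				/* QP to use for the next frame */
};

/*
 * Update the QP from the size of the frame that was just encoded.
 * bytesused is the payload size reported for the dequeued buffer.
 * Returns the QP to use for the next frame.
 */
int rc_update(struct rc_state *rc, uint32_t bytesused)
{
	uint64_t bits = (uint64_t)bytesused * 8;
	uint64_t target;

	assert(rc);
	target = rc->target_bits_per_frame;

	/* More than 10% over budget: quantize harder. */
	if (bits * 10 > target * 11)
		rc->qp++;
	/* More than 10% under budget: spend more bits. */
	else if (bits * 10 < target * 9)
		rc->qp--;

	if (rc->qp < QP_MIN)
		rc->qp = QP_MIN;
	if (rc->qp > QP_MAX)
		rc->qp = QP_MAX;

	return rc->qp;
}
```

A real rate controller would also look at frame type and buffer fullness, but
none of that requires more per-frame feedback from the driver than the
resulting size.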

So all in all I think we're good with the current status of request support.

Yup. I agree. Initially, while working on VP8 encoding we introduced (read-only)
requests on the capture queue, but they turned out not to be useful in this
context and we removed them.

Regards.

Andrzej

Cheers,

Paul