Re: Hantro H1 Encoding Upstreaming

Hi folks,

On Tue 14 Jan 25, 11:16, Nicolas Dufresne wrote:
> despite Andrzej having left the community, we are not giving up on the encoder
> work. In 2025, we aim to work more seriously on the V4L2 spec, as just writing
> a driver won't cut it. Each class of codecs needs a general workflow spec
> similar to what we already have for stateful encoders/decoders and stateless
> decoders.
> 
> - https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html
> - https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-encoder.html
> - https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-stateless-decoder.html
> 
> It is on top of this that, for each codec, we have to add the specific
> controls (mostly compound) and details that suit stateless accelerators.
> 
> From a community standpoint, the most important focus is to write and agree on
> the spec and controls. Once we have that, vendors will be able to slowly move
> away from their custom solutions and compete on actual hardware rather than on
> integration.

Thanks for the continued interest in this topic; I am also still interested in
pushing it forward and defining a mainline API for stateless encoders that fits
the bill.

> It is also time to start looking toward the future, since the Hantro H1 is a
> very limited and ancient encoder. Within the same brand, if someone could work
> on the VC8000E shipped on the i.MX8M Plus, or on Rockchip codecs, that would
> certainly help progress. We can also take inspiration from many other
> stateless encoding APIs now, notably VA, DXVA and Vulkan Video.

The VC8000E on the i.MX8MP is definitely my next hardware of interest here.
I will have time available to work on it in the near future.

> Of course, folks like to know when this will happen. Stateless decoders took 5
> years from start to the first codec being merged; hopefully we won't beat that
> record. I personally aim to produce work during the summer, mostly focusing on
> the spec.

To be fair, we are not starting from scratch and seem to have good momentum
here, so I am hopeful it will not take as long!

> It's obvious to me that testing on the H1 with a GStreamer implementation is
> the most productive, though I have a strong interest in having an ecosystem of
> drivers. A second userspace implementation, perhaps FFmpeg, could also be
> useful.

I would be glad not to have to work on the GStreamer side and to focus on
kernel work instead. So far, we can already aim to support:
- Hantro H1
- Hantro H2/VC8000E
- Allwinner Video Engine

> If you'd like to take a bite, this is a good thread to discuss it further.
> Until the summer, I planned to reach out to Paul, who gave this great
> presentation [1] at FOSDEM last year, and start moving the RFC toward using
> these ideas. One of the biggest discussions is rate control; it is clear to me
> that modern HW integrates RC offloading, through some HW-specific knobs or
> even firmware offloading, and this is what Paul has been putting some thought
> into.

In terms of RC offloading, what I've seen in the Hantro H1 is a checkpoint
mechanism that allows making per-slice QP adjustments around the global picture
QP to fit the bill in terms of size. This can be desirable if the use case is
to stick strictly to a given bitrate.

There are also the regions of interest that are supported by many (most?)
encoders and allow region-based QP changes (typically as an offset). The number
of available slots is hardware-specific.

In addition, the H1 provides some extra statistics, such as the "average"
resulting QP when one of these methods is used.
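
To make this more concrete, here is a very rough sketch of what per-frame
compound controls for ROI and checkpoint configuration could look like. All of
the names and field layouts below are made up for illustration and do not
exist in the current V4L2 uAPI:

/*
 * Hypothetical sketch only: how ROI and checkpoint QP adjustments could be
 * exposed as per-frame compound controls. None of these names exist in the
 * current V4L2 uAPI.
 */

#include <linux/types.h>

/* The slot count is hardware-specific; 8 is an arbitrary example. */
#define V4L2_STATELESS_ENC_MAX_ROI	8

struct v4l2_stateless_enc_roi {
	/* Rectangle covered by this region of interest, in pixels. */
	__u32	left;
	__u32	top;
	__u32	width;
	__u32	height;
	/* QP offset applied on top of the global picture QP. */
	__s8	qp_offset;
	__u8	reserved[3];
};

struct v4l2_ctrl_stateless_enc_roi_params {
	__u32	num_roi;
	struct v4l2_stateless_enc_roi roi[V4L2_STATELESS_ENC_MAX_ROI];
};

/*
 * Checkpoint-style hints in the spirit of the Hantro H1: target bit budgets
 * at regular macroblock-row checkpoints, which the hardware uses to nudge the
 * per-slice QP around the picture QP.
 */
struct v4l2_ctrl_stateless_enc_checkpoint_params {
	__u32	checkpoint_interval_mb_rows;
	__u32	target_bits_per_checkpoint;
	__s8	qp_delta_min;
	__s8	qp_delta_max;
	__u8	reserved[2];
};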

I guess my initial point about rate control was that it would be easier for
userspace to be able to choose a rate-control strategy directly and to have
common kernel-side implementations that would apply to all codecs. It would
also allow leveraging hardware features without userspace having to know about
them.

However, the main drawback is that there will always be a need for a more
specific/advanced use case than what the kernel is doing (e.g. using an NPU),
which would require userspace to have more control over the encoder.

So a more direct interface would be required to let userspace do rate control.
At the end of the day, I think it would make more sense to expose these
encoders for what they are, deal with the QP and features directly through the
uAPI, and avoid any kernel-side rate control. Hardware-specific features that
need to be configured and may return stats would simply get extra controls.

So all in all, we'd need a few new controls to configure the encode for each
codec (starting with H.264) and also some to provide encode stats (e.g.
requested QP, average QP). It feels like we could benefit from the existing
stateful encoder controls for various bitstream parameters.

Then userspace would be responsible for configuring each encode run with a
target QP value, a picture type and a list of references. We'd also need to
inform userspace of how many references are supported.
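
As a purely illustrative sketch (again, all of these structure and field names
are hypothetical), the per-run configuration and returned statistics could
take a shape along these lines:

/*
 * Hypothetical sketch of per-run encode parameters and statistics for a
 * stateless H.264 encoder; the names are made up for illustration and do not
 * exist in the current V4L2 uAPI.
 */

#include <linux/types.h>

/* The actual maximum would be advertised by the driver. */
#define V4L2_STATELESS_H264_ENC_MAX_REFS	2

struct v4l2_ctrl_stateless_h264_encode_params {
	/* Target QP for this encode run. */
	__u8	qp;
	/* I/P/B picture type. */
	__u8	pic_type;
	/* Number of valid entries in reference_ts[]. */
	__u8	num_refs;
	__u8	reserved;
	/*
	 * References identified by buffer timestamp, similarly to what the
	 * stateless decoder interfaces do.
	 */
	__u64	reference_ts[V4L2_STATELESS_H264_ENC_MAX_REFS];
};

/* Read-only control filled in once the encode run completes. */
struct v4l2_ctrl_stateless_h264_encode_stats {
	__u8	requested_qp;
	__u8	average_qp;
	__u8	reserved[2];
	__u32	encoded_length;
};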

Another topic of interest is bitstream header generation. I believe it would be
easier for the kernel side to generate the headers (some hardware has specific
registers to write them) based on the configuration provided by userspace
through controls. It is also useful to be able to regenerate them on demand. I
am not sure whether there would be interest in more precise tracking of
bitstream headers (e.g. H.264 SPS and PPS, which have ids) and the ability to
bind them to specific encode runs.

We could have some common per-codec bitstream generation V4L2 code, with either
a CPU buffer access backend or a driver-specific implementation for writing the
bits. I already have a base for this in my cedrus H.264 encoder work:
https://github.com/bootlin/linux/blob/cedrus/h264-encoding/drivers/staging/media/sunxi/cedrus/cedrus_enc_h264.c#L722
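
For reference, here is a minimal sketch of the kind of pluggable bit-writer
abstraction I have in mind. The names are hypothetical and this is not the
actual cedrus code linked above; the idea is that a CPU-buffer backend writes
into a kernel-mapped buffer, while a driver with header-writing registers
would provide its own ops:

/*
 * Hypothetical sketch of a shared bitstream writer with pluggable backends;
 * names are illustrative only.
 */

#include <linux/bitops.h>
#include <linux/types.h>

struct v4l2_bitstream_writer;

struct v4l2_bitstream_writer_ops {
	/* Append 'count' bits (MSB first) taken from 'value'. */
	void (*write_bits)(struct v4l2_bitstream_writer *writer,
			   u32 value, unsigned int count);
	/* Pad with zero bits up to the next byte boundary. */
	void (*align)(struct v4l2_bitstream_writer *writer);
};

struct v4l2_bitstream_writer {
	const struct v4l2_bitstream_writer_ops	*ops;
	/* CPU-buffer backend state, unused by register-based backends. */
	u8					*buffer;
	size_t					size;
	size_t					bit_offset;
};

/*
 * Exp-Golomb helper shared by all backends and by the per-codec header
 * generation code, since it only relies on the write_bits operation.
 */
static void v4l2_bitstream_write_ue(struct v4l2_bitstream_writer *writer,
				    u32 value)
{
	/* Number of significant bits in (value + 1). */
	unsigned int bits = fls(value + 1);

	/* (bits - 1) leading zero bits, then (value + 1) on 'bits' bits. */
	writer->ops->write_bits(writer, 0, bits - 1);
	writer->ops->write_bits(writer, value + 1, bits);
}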

A last word about private driver buffers (such as motion vector and
reconstruction buffers): I think they should remain private and hidden from
userspace. We could add something extra to the uAPI later if there really is a
need to access them.

Cheers,

Paul

> If decoders have progressed so much in quality in the last few years, it is
> mostly because we have better ways to test them. We also need to start
> thinking about how we want to test our encoders. The stateful scene is not all
> green, with very organic growth and a difficult-to-unify set of encoders. And
> we have no metric of how good or bad they are either.
>
> regards,
> Nicolas
> 
> On Monday 13 January 2025 at 18:08 -0300, Daniel Almeida wrote:
> > +cc Nicolas
> > 
> > 
> > Hey Adam,
> > 
> > 
> > > 
> > > Daniel,
> > > 
> > > Do you know if anyone will be picking up the H1 encoder?
> > > 
> > > adam
> > > > 
> > > > — Daniel
> > > > 
> > > 
> > 
> > I think my colleague Nicolas is the best person to answer this.
> > 
> > — Daniel
> 

-- 
Paul Kocialkowski,

Independent contractor - sys-base - https://www.sys-base.io/
Free software developer - https://www.paulk.fr/

Expert in multimedia, graphics and embedded hardware support with Linux.
