Re: Proposal: A third buffer type for the reconstruction buffers in V4L2 M2M encoder

On 6/28/22 03:22, Nicolas Dufresne wrote:


Hi,

Le mardi 28 juin 2022 à 00:12 +0800, ayaka a écrit :
Hi All

I think we need a separate buffer queue to manage the reconstruction or
auxiliary buffers in V4L2 M2M encoder drivers.

Some drivers already allocate internal buffers as the reconstruction
buffers for their encoding instances. drivers/media/platform/chips-media
is one example: its coda_alloc_context_buf() allocates the maximum allowed
number of references for an instance as reconstruction buffers. You can't
control the lifetime of a reconstruction buffer here, which means you
can't control which buffers are used as references.

That may be acceptable for a hardware encoder with a control chip that can
do some bitrate control on its own. For stateless encoders, which are
driven entirely by userspace, it would be better to let the user decide
the lifetime of a reconstruction buffer.

In the SVC case, a layer may refer to a buffer in another layer that was
encoded many frames earlier.

I would love to see a proposal for SVC support, that would greatly help to
understand where external reconstructed frames buffer management falls in. Just
"controlling lifetime" is too weak a justification for the added complexity.

There are three variants of SVC; the simplest one is SVC-T (temporal
scalability):

layer 0   Intra A B C D Intra E F G H I
layer 1   ^--------A1     ^----------B2

Even the Hantro H1 can support SVC-T.
The major requirement is being able to refer to a reconstruction buffer
that was encoded a long time ago.

I am not sure which way is better; I would implement one based on the
feedback. One is reusing V4L2_BUF_TYPE_VIDEO_OVERLAY: it would support REQBUFS,

I don't think a re-purpose is a good idea.
All right, I'll drop that idea then.

S_FMT, G_FMT, QBUF, and DQBUF besides the existing M2M operations. Another
idea is extending those ioctls to the media node that the stateless M2M
driver uses for allocating the request_fd token.

CODA gets past this: it hides an internal pixel format, which has no use
outside of the chip. We'd have to introduce more vendor formats in order to
allow S_FMT and friends. Having to queue reference buffers also requires
in-depth knowledge of the decoding process, which I think is a misfit for a
stateful decoder.

Maybe the user doesn't need to know the exact pixel format, but the user
still needs to know the size of the buffers, and I would like to let the
user decide how many buffers to allocate.

For the chips-media encoder, its processor (the bit processor) doesn't
decide which frames are used as references. Actually, stateless and
stateful encoders don't differ much here. And things can stay simple if we
don't involve bi-directional inter prediction: just remember to mark a
slice, or rewrite a slice's header, as a long-term reference.


Please note that the reconstruction buffers may use a different pixel
format than the input frames. For example, the Hantro H1 uses NV12_4L4 in
its reconstruction buffers, and later generations of chips-media codecs
use an FBC format.
Also, some decoders have an online (in-pipeline) post-processor, which
means they can't do pixel format conversion independently. Drivers for
those devices may need this as well.

Even for a decoder, when there is an inline post-processor, an extra set
of buffers is allocated internally.

I'm not sure what I could propose on top here, since there is very little
skeleton in this proposal. It is more a feature request, so stepping back
a little, perhaps we should start with the real-life use cases that need
this, and from there we can think of a flow?

We may assume this feature would be needed for the encoder first. I don't
know what has been disclosed about our Synaptics VideoSmart series.

I would use the Hantro D1 as a sample here (the VideoSmart series doesn't
use this IP and has its own, much more powerful, proprietary video
decoder). For example, the D1 can produce a second image at a lower
resolution and in a non-tiled pixel format. A user could render that
lower-resolution image on an overlay plane of Linux DRM. Although the
post-processor of the Hantro D1 can also work offline, the in-pipeline
mode is faster: it is triggered per macroblock, not after a whole image
is decoded.


Sincerely
Randy


--
Hsia-Jun(Randy) Li


