Re: [RFC RESEND 2/3] media: uapi: Add VP9 stateless decoder controls

Nicolas Dufresne <nicolas@xxxxxxxxxxxx> · Wed, 05 May 2021 13:36:06 -0400

Hi Hans,

just a partial reply, I'll let Andrzej extend.

Le jeudi 29 avril 2021 à 12:20 +0200, Hans Verkuil a écrit :
> > +      - ``frame_width_minus_1``
> > +      - Add 1 to get the frame width expressed in pixels.
> > +    * - __u16
> > +      - ``frame_height_minus_1``
> > +      - Add 1 to get the frame height expressed in pixels.
> 
> These two fields are weird. Isn't this defined by setting the output format?
> And why the 'minus_1'?
> 
> > +    * - __u16
> > +      - ``render_width_minus_1``
> > +      - Add 1 to get the expected render width expressed in pixels. This is
> > +        not used during the decoding process but might be used by HW
> > scalers to
> > +        prepare a frame that's ready for scanout.
> > +    * - __u16
> > +      - render_height_minus_1
> > +      - Add 1 to get the expected render height expressed in pixels. This
> > is
> > +        not used during the decoding process but might be used by HW
> > scalers to
> > +        prepare a frame that's ready for scanout.
> 
> No idea what these fields are about. I suspect this can be defined by setting
> the capture format, but I'm not sure.

We have the same for other codecs. Each codec bitstream include the coded and
the display size in some form. For H264/H265 that was passed as as number of
macroblock and a crop rectangle. For VP9 value - 1 is as defined in the spec. As
0 is not valid, the boolean coder can save some bits when storing the value.
Though, for parameters, we usually start with the way it's encoded first, and
change it only if we think it make sense. We'll take note and consider this
whenever we have a second driver to look at.

Now, why do we include both here. Well in fact, the HW may have some extra
constraints. Which means there will be up to 3 frame sizes that matters. We
can't also determine if the display/render or coded size will be used for minim
CAPTURE format, as this will in fact depends if a post processor will be used or
not. 

So to recap, we limit the CAPTURE format base on this information, and not the
opposite. The width/height on OUTPUT FMT has been define as meaningless in the
spec (to align with stateful I suppose ?). This way, the driver got all the
information aligned with how it works inside the codec, without having to do a
translation dance, and then properly implement CAPTURE TRY_FMT base on that.

To make an analogy with stateful codec, this replaces the queuing of a frame
that contains codec headers. We skip the SRC_CH events, since this is no longer
asynchronous.

Nicolas