Re: [RFC] Request API and V4L2 capabilities

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 22, 2018 at 11:10 PM Paul Kocialkowski
<paul.kocialkowski@xxxxxxxxxxx> wrote:
>
> Hi,
>
> On Tue, 2018-08-21 at 17:52 +0900, Tomasz Figa wrote:
> > Hi Hans, Paul,
> >
> > On Mon, Aug 6, 2018 at 6:29 PM Paul Kocialkowski
> > <paul.kocialkowski@xxxxxxxxxxx> wrote:
> > >
> > > On Mon, 2018-08-06 at 11:23 +0200, Hans Verkuil wrote:
> > > > On 08/06/2018 11:13 AM, Paul Kocialkowski wrote:
> > > > > Hi,
> > > > >
> > > > > On Mon, 2018-08-06 at 10:32 +0200, Hans Verkuil wrote:
> > > > > > On 08/06/2018 10:16 AM, Paul Kocialkowski wrote:
> > > > > > > On Sat, 2018-08-04 at 15:50 +0200, Hans Verkuil wrote:
> > > > > > > > Regarding point 3: I think this should be documented next to the pixel format. I.e.
> > > > > > > > the MPEG-2 Slice format used by the stateless cedrus codec requires the request API
> > > > > > > > and that two MPEG-2 controls (slice params and quantization matrices) must be present
> > > > > > > > in each request.
> > > > > > > >
> > > > > > > > I am not sure a control flag (e.g. V4L2_CTRL_FLAG_REQUIRED_IN_REQ) is needed here.
> > > > > > > > It's really implied by the fact that you use a stateless codec. It doesn't help
> > > > > > > > generic applications like v4l2-ctl or qv4l2 either since in order to support
> > > > > > > > stateless codecs they will have to know about the details of these controls anyway.
> > > > > > > >
> > > > > > > > So I am inclined to say that it is not necessary to expose this information in
> > > > > > > > the API, but it has to be documented together with the pixel format documentation.
> > > > > > >
> > > > > > > I think this is affected by considerations about codec profile/level
> > > > > > > support. More specifically, some controls will only be required for
> > > > > > > supporting advanced codec profiles/levels, so they can only be
> > > > > > > explicitly marked with appropriate flags by the driver when the target
> > > > > > > profile/level is known. And I don't think it would be sane for userspace
> > > > > > > to explicitly set what profile/level it's aiming at. As a result, I
> > > > > > > don't think we can explicitly mark controls as required or optional.
> >
> > I'm not sure this is entirely true. The hardware may need to be
> > explicitly told what profile the video is. It may even not be the
> > hardware, but the driver itself too, given that the profile may imply
> > the CAPTURE pixel format, e.g. for VP9 profiles:
> >
> > profile 0
> > color depth: 8 bit/sample, chroma subsampling: 4:2:0
> > profile 1
> > color depth: 8 bit, chroma subsampling: 4:2:0, 4:2:2, 4:4:4
> > profile 2
> > color depth: 10–12 bit, chroma subsampling: 4:2:0
> > profile 3
> > color depth: 10–12 bit, chroma subsampling: 4:2:0, 4:2:2, 4:4:4
> >
> > (reference: https://en.wikipedia.org/wiki/VP9#Profiles)
>
> I think it would be fair to expect userspace to select the right
> destination format (and maybe have the driver error if there's a
> mismatch with the meta-data) instead of having the driver somewhat
> expose what format should be used.

There are many different memory representations of the same physical
YUV format, just for YUV 4:2:0: NV12, NV12M, NV21, NV21M, YUV420,
YUV420M, YVU420, YVU420M. It depends on hardware and driver which one
would be available for given stream to decode. How is the user space
expected to know which one is, without querying the driver first?

>
> But maybe this would be an API violation, since all the enumerated
> formats are probably supposed to be selectable?

Correct.

>
> We could also look at it the other way round and consider that selecting
> an exposed format is always legit, but that it implies passing a
> bitstream that matches it or the driver will error (because of an
> invalid bitstream passed, not because of a "wrong" selected format).

As per above, it's unlikely that a generic user space can set the
right format. It may be able to narrow down the list of exposed
formats to those which make sense for the generic information about
the stream it has, e.g. VP9 profile 0 -> YUV 4:2:0, but there may
still be a constraint on which representation is allowed depending on
stream features.

>
> As far as I understood, the profile/level information is there to
> indicate a set of supported features by the decoder, not as an
> information used for the decoding process. Each corresponding feature is
> enabled or not in the bitstream meta-data and that's all the information
> the decoder really needs.
>
> This is why I think that setting the profile/level explicitly is not
> justified by the nature of the process and adding it only for
> convenience or marking whether controls are optional doesn't seem
> justified at this point, in my opinion.

Okay, it's actually a good point. Whether given format can be
supported is not entirely dictated by profile/level. There may be some
initial negotiation needed, involving the user space setting the
parsed stream parameters through respective controls.

>
> > > > > > > I also like the idea that it should instead be implicit and that the
> > > > > > > documentation should detail which specific stateless metadata controls
> > > > > > > are required for a given profile/level.
> > > > > > >
> > > > > > > As for controls validation, the approach followed in the Cedrus driver
> > > > > > > is to check that the most basic controls are filled and allow having
> > > > > > > missing controls for those that match advanced profiles.
> > > > > > >
> > > > > > > Since this approach feels somewhat generic enough to be applied to all
> > > > > > > stateless VPU drivers, maybe this should be made a helper in the
> > > > > > > framework?
> > > > > >
> > > > > > Sounds reasonable. Not sure if it will be in the first version, but it is
> > > > > > easy to add later.
> > > > >
> > > > > Definitely, I don't think this is such a high priority for now either.
> > > > >
> >
> > We may want to put strict requirements on what controls are provided
> > for given codec+profile/level. Otherwise we might get some user space
> > that doesn't provide some of them and works only by luck, e.g. because
> > some hardware defaults on initial drivers luckily match the needed
> > values. Even if we don't validate it in the code yet, we should put a
> > big warning saying that not providing the required controls would
> > result in undefined behavior.
>
> I don't think having such strict requirements are a good thing. Even
> with the level/profile made explicit, what if the video under-uses its
> features and thus legitimately doesn't need to have all the controls
> that could be supported with the level/profile? This can probably also
> be frame-specific, so some frames could require more controls than
> others.

That's exactly the best first step towards an user space that relies
on unspecified behavior. Even if the video doesn't use all the
features as per given profile/level, user space should be expected to
initialize the state with reasonable defaults, since it's essentially
for the driver/hardware to decide which state it needs to do the
decoding.

Whether profile/level is the right indicator to judge what's required
and what's not is a different question. It indeed may not be. However,
we need to clearly state what controls and when the application needs
to set in the specification. I'm not talking about requirements as for
bailing out with an error code, but exactly as for having that
included in the specification and correct behavior only guaranteed if
followed correctly.

>
> This also leads me to believe that the profile/level indication should
> be used as a support indication to userspace, not as a way to expose the
> required features for decoding to the kernel
>
> We could still enforce checks for the most basic controls (that are used
> for all types of slices to decode) and error if they are missing. We
> could also check the bits that indicate more advanced features in these
> basic controls and decide what other controls are required from that.
>
> > > > > > > In addition, I see a need for exposing the maximum profile/level that
> > > > > > > the driver supports for decoding. I would suggest reusing the already-
> > > > > > > existing dedicated controls used for encoding for this purpose. For
> > > > > > > decoders, they would be used to expose the (read-only) maximum
> > > > > > > profile/level that is supported by the hardware and keep using them as a
> > > > > > > settable value in a range (matching the level of support) for encoders.
> > > > > > >
> > > > > > > This is necessary for userspace to determine whether a given video can
> > > > > > > be decoded in hardware or not. Instead of half-way decoding the video
> > > > > > > (ending up in funky results), this would easily allow skipping hardware
> > > > > > > decoding and e.g. falling back on software decoding.
> > > > > >
> > > > > > I think it might be better to expose this through new read-only bitmask
> > > > > > controls: i.e. a bitmask containing the supported profiles and levels.
> > > > >
> > > > > It seems that this is more or less what the coda driver is doing for
> > > > > decoding actually, although it uses a menu control between min/max
> > > > > supported profile/levels, with a mask to "blacklist" the unsupported
> > > > > values. Then, the V4L2_CTRL_FLAG_READ_ONLY flag is set to keep the
> > > > > control read-only.
> > > > >
> > > > > > Reusing the existing controls for a decoder is odd since there is not
> > > > > > really a concept of a 'current' value since you just want to report what
> > > > > > is supported. And I am not sure if all decoders can report the profile
> > > > > > or level that they detect.
> > > > >
> > > > > Is that really a problem when the READ_ONLY flag is set? I thought it
> > > > > was designed to fit this specific case, when the driver reports a value
> > > > > that userspace cannot affect.
> > > >
> > > > Well, for read-only menu controls the current value of the control would
> > > > have to indicate what the current profile/level is that is being decoded.
> > > >
> > > > That's not really relevant since what you want is just to query the
> > > > supported profiles/levels. A read-only bitmask control is the fastest
> > > > method (if only because using a menu control requires the application to
> > > > enumerate all possibilities with QUERYMENU).
> >
> > Besides querying for supported profiles,
> >  - For stateless codecs we also need to set the profile, since the
> > codec itself does only the number crunching.
>
> I disagree here, see above.
>

Indeed, profile is normally a function of various parameters included
in the more general decoder state (e.g. bitstream headers).

> >  - For stateful codecs, the decoder would also report the detected
> > profile after parsing the bitstream (although this is possibly not of
> > a big importance to the user space).
>
> I don't follow the logic behind this. This means informing userspace of
> the capabilities required to decode the video that the VPU is currently
> decoding... so that it can decided whether to decode it?

That's what some of current drivers do, but I also fail to see it
being useful. We should make sure that there is no user space relying
on this behavior, if we intend to change it.

>
> > As for querying itself, there is still more to it than could be
> > handled with just a read only control. To detect what CAPTURE formats
> > are supported for given profile, one would have to set the profile
> > control first and then use ENUM_FMT.
>
> Or consider that the bitstream is invalid if it doesn't match the
> selected format and let userspace pick the appropriate format.

See above. The appropriate format is only known to the driver.

Best regards,
Tomasz




[Index of Archives]     [Linux Input]     [Video for Linux]     [Gstreamer Embedded]     [Mplayer Users]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]

  Powered by Linux