On Mon, Mar 12, 2018 at 5:25 PM, Paul Kocialkowski
<paul.kocialkowski@xxxxxxxxxxx> wrote:
> Hi,
>
> On Mon, 2018-03-12 at 17:15 +0900, Tomasz Figa wrote:
>> Hi Paul, Dmitry,
>>
>> On Mon, Mar 12, 2018 at 5:10 PM, Paul Kocialkowski
>> <paul.kocialkowski@xxxxxxxxxxx> wrote:
>> > Hi,
>> >
>> > On Sun, 2018-03-11 at 22:42 +0300, Dmitry Osipenko wrote:
>> > > Hello,
>> > >
>> > > On 07.03.2018 19:37, Paul Kocialkowski wrote:
>> > > > Hi,
>> > > >
>> > > > First off, I'd like to take the occasion to say thank you for your
>> > > > work. This is a major piece of plumbing that is required for me to
>> > > > add support for the Allwinner CedarX VPU hardware in upstream Linux.
>> > > > Other drivers, such as tegra-vde (which was recently merged into
>> > > > staging), are also badly in need of this API.
>> > >
>> > > It would certainly be good to have a common UAPI. I haven't yet gotten
>> > > around to implementing the V4L interface for the tegra-vde driver, but
>> > > I've taken a look at the Cedrus driver and for now I have one question:
>> > >
>> > > Would it be possible (or maybe it already is) to have a single ioctl
>> > > that takes input/output buffers with codec parameters, processes the
>> > > request(s) and returns to userspace when everything is done? Having 5
>> > > context switches for a single frame decode (like the Cedrus VAAPI
>> > > driver does) looks like a bit of overhead.
>> >
>> > The V4L2 interface exposes ioctls for different actions and I don't
>> > think there's a combined ioctl for this. The request API was introduced
>> > precisely because we need consistency between the various ioctls needed
>> > for each frame. Maybe a single (atomic) ioctl would have worked too,
>> > but that's apparently not how the V4L2 API was designed.
>> >
>> > I don't think there is any particular overhead caused by having n
>> > ioctls instead of a single one. At least that would be very surprising
>> > IMHO.
>>
>> Well, there is a small syscall overhead, which normally shouldn't be
>> very painful, although with all the speculative execution hardening,
>> one can't be sure of anything anymore. :)
>
> Oh, my mistake then, I had it in mind that it's not really something
> noticeable. Hopefully it won't be a limiting factor in our cases.

With the typical frame rates achievable by hardware codecs, I doubt that it
would be a limiting factor. We're using a similar API (a WiP version of a
pre-Request API prototype from long ago) in Chrome OS already, without any
performance issues.
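
For reference, here is a rough sketch of where the per-frame ioctl count
comes from. It uses the Request API UAPI as it was eventually merged in
mainline (MEDIA_IOC_REQUEST_ALLOC, V4L2_CTRL_WHICH_REQUEST_VAL,
V4L2_BUF_FLAG_REQUEST_FD, MEDIA_REQUEST_IOC_QUEUE), which may differ in
detail from the WiP versions discussed in this thread; the decode_one_frame()
helper and its arguments are hypothetical, and device setup (S_FMT, REQBUFS,
mmap, STREAMON) is assumed to have happened already.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/media.h>
#include <linux/videodev2.h>

/* Submit one frame for decoding. media_fd and video_fd are the opened
 * media controller and video device nodes; error cleanup (closing the
 * request fd) is elided for brevity. */
static int decode_one_frame(int media_fd, int video_fd,
			    struct v4l2_ext_control *codec_ctrls,
			    unsigned int nctrls,
			    struct v4l2_buffer *bitstream_buf,
			    struct v4l2_buffer *capture_buf)
{
	int request_fd;

	/* 1. Allocate a request on the media device (reusable via
	 *    MEDIA_REQUEST_IOC_REINIT on later frames). */
	if (ioctl(media_fd, MEDIA_IOC_REQUEST_ALLOC, &request_fd) < 0)
		return -1;

	/* 2. Attach the per-frame codec parameters (slice/frame headers,
	 *    reference lists, ...) to that request. */
	struct v4l2_ext_controls ctrls;
	memset(&ctrls, 0, sizeof(ctrls));
	ctrls.which = V4L2_CTRL_WHICH_REQUEST_VAL;
	ctrls.request_fd = request_fd;
	ctrls.count = nctrls;
	ctrls.controls = codec_ctrls;
	if (ioctl(video_fd, VIDIOC_S_EXT_CTRLS, &ctrls) < 0)
		return -1;

	/* 3. Queue the bitstream (OUTPUT) buffer as part of the same
	 *    request. */
	bitstream_buf->flags |= V4L2_BUF_FLAG_REQUEST_FD;
	bitstream_buf->request_fd = request_fd;
	if (ioctl(video_fd, VIDIOC_QBUF, bitstream_buf) < 0)
		return -1;

	/* 4. Queue a CAPTURE buffer for the decoded frame (this one is not
	 *    bound to the request). */
	if (ioctl(video_fd, VIDIOC_QBUF, capture_buf) < 0)
		return -1;

	/* 5. Submit the request: the controls and the OUTPUT buffer are
	 *    applied together when the driver picks it up. */
	if (ioctl(request_fd, MEDIA_REQUEST_IOC_QUEUE) < 0)
		return -1;

	/* The caller then poll()s and VIDIOC_DQBUFs the decoded frame, which
	 * is roughly where the "5 context switches per frame" count comes
	 * from. */
	return request_fd;
}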