On Tue, May 21, 2019 at 8:45 PM Paul Kocialkowski
<paul.kocialkowski@xxxxxxxxxxx> wrote:
>
> Hi,
>
> On Tue, 2019-05-21 at 19:27 +0900, Tomasz Figa wrote:
> > On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
> > <paul.kocialkowski@xxxxxxxxxxx> wrote:
> > > Hi,
> > >
> > > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > > > Hi,
> > > > >
> > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > better idea of what the situation is like on platforms other than
> > > > > Allwinner. This email shares my conclusions about the situation and how
> > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > >
> > > > > - Per-slice decoding
> > > > >
> > > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > > to implement the required core bits. When we agree it looks good, we
> > > > > should lift the restriction that all slices must be concatenated and
> > > > > have them submitted as individual requests.
> > > > >
> > > > > One question is what to do about other controls. I feel like it would
> > > > > make sense to always pass all the required controls for decoding the
> > > > > slice, including the ones that don't change across slices. But there
> > > > > may be no particular advantage to this and only downsides. Not doing it
> > > > > and relying on the "control cache" can work, but we need to specify
> > > > > that only a single stream can be decoded per opened instance of the
> > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > multi-slice anyway, so it shouldn't be an issue.
> > > >
> > > > My opinion on this is that the m2m instance is a state, and the driver
> > > > should be responsible for doing time-division multiplexing across
> > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > userspace would require some sort of daemon to work properly across
> > > > processes. I also think the kernel is a better place for doing resource
> > > > access scheduling in general.
> > >
> > > I agree with that, yes. We always have a single m2m context and specific
> > > controls per opened device, so keeping cached values works out well.
> > >
> > > So maybe we shall explicitly require that the request with the first
> > > slice for a frame also contains the per-frame controls.
> > >
> >
> > Agreed.
> >
> > One more argument not to allow such multiplexing is that despite the

^^ Here I meant the "userspace multiplexing".

> > API being called "stateless", there is actually some state saved
> > between frames, e.g. the Rockchip decoder writes some intermediate
> > data to some local buffers which need to be given to the decoder to
> > decode the next frame. Actually, on Rockchip there is even a
> > requirement to keep the reference list entries in the same order
> > between frames.
>
> Well, what I'm suggesting is to have one stream per m2m context, but it
> should certainly be possible to have multiple m2m contexts (multiple
> userspace open calls) that decode different streams concurrently.
>
> Is that really going to be a problem for Rockchip? If so, then the
> driver should probably enforce allowing a single userspace open and m2m
> context at a time.

No, that's not what I meant. Obviously the driver can switch between
different sets of private buffers when scheduling different contexts,
as long as userspace doesn't attempt to do any multiplexing itself.

Best regards,
Tomasz
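
To illustrate the last point, here is a minimal sketch in plain C of a driver
that keeps one set of private buffers per context and switches between them
when it schedules a job. This is not actual V4L2 or Rockchip driver code: the
struct names, function names and buffer size are all hypothetical, and the
"hardware" is reduced to a counter update.

```c
#include <assert.h>
#include <stddef.h>

#define PRIV_BUF_SIZE 16 /* hypothetical size of the per-context scratch data */

/* Hypothetical per-open() context: one stream, one set of private buffers. */
struct dec_ctx {
	int stream_id;
	unsigned char priv_buf[PRIV_BUF_SIZE]; /* intermediate data kept between frames */
	unsigned int frames_decoded;
};

/* Hypothetical device state: the hardware works on one context at a time. */
struct dec_dev {
	struct dec_ctx *cur; /* context whose private buffers are currently in use */
	unsigned int ctx_switches;
};

/* Point the "hardware" at the private buffers of the context being scheduled. */
static void dec_dev_switch_ctx(struct dec_dev *dev, struct dec_ctx *ctx)
{
	assert(dev != NULL && ctx != NULL);
	if (dev->cur != ctx) {
		dev->cur = ctx;
		dev->ctx_switches++;
	}
}

/* Run one job for ctx; only that context's saved state is read and updated. */
static void dec_dev_run_job(struct dec_dev *dev, struct dec_ctx *ctx)
{
	dec_dev_switch_ctx(dev, ctx);
	/* Stand-in for the decoder updating its intermediate data. */
	dev->cur->priv_buf[0] = (unsigned char)(dev->cur->priv_buf[0] + 1);
	dev->cur->frames_decoded++;
}
```

Because each context owns its buffers, jobs from two opens can be interleaved
by the driver in any order without one stream seeing the other's state, which
is exactly what breaks down if userspace interleaves two streams on a single
context.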