On Mon, Aug 11, 2014 at 01:38:55PM +0300, Pekka Paalanen wrote:
> Hi,
>
> there is some hardware that can do 2D compositing with an arbitrary
> number of planes. I'm not sure what the absolute maximum number of
> planes is, but for the discussion, let's say it is 100.
>
> There are many complicated, dynamic constraints on how many planes,
> of what size, etc. can be used at once. A driver would be able to
> check those before kicking the 2D compositing engine.
>
> In the best case (only a few planes used), the 2D compositing engine
> is able to composite on the fly in scanout, just like the usual
> overlay hardware blocks in CRTCs. When the composition complexity
> goes up, the driver can fall back to compositing into a buffer
> rather than on the fly in scanout. This fallback needs to be
> completely transparent to user space, implying only additional
> latency if anything.
>
> These 2D compositing features should be exposed to user space
> through a standard kernel ABI, hopefully an existing ABI in the
> very near future like the KMS atomic one.

I presume we're talking about the VideoCore from the RasPi? Or at
least something similar?

> Assuming the DRM universal planes and atomic mode setting / page
> flip infrastructure is in place, could the 2D compositing
> capabilities be exposed through universal planes? We can assume
> that plane properties are enough to describe all the compositing
> parameters.
>
> Atomic updates are needed so that the complicated constraints can
> be checked, and user space can try to reduce the composition
> complexity if the kernel driver sees that it won't work.
>
> Would it be feasible to generate a hundred identical non-primary
> planes to be exposed to user space via DRM?
>
> If that could be done, the kernel driver could just use the
> existing kernel/user ABIs without having to invent something new,
> and programs like a Wayland compositor would not need to be coded
> specifically for this hardware.
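As an aside, the transparent fallback described above boils down to a
cost check on the driver side. Here is a toy model in C - the
plane_cfg struct, the cost function, and the budget of 8 "units" are
all made up for illustration and come from no real driver; actual
hardware would check memory bandwidth, plane sizes, scaler
availability, and so on:

```c
#include <assert.h>
#include <stdbool.h>

struct plane_cfg {
	int width, height;
	bool scaled;
};

enum compose_mode {
	COMPOSE_ON_THE_FLY,	/* real-time compositing in scanout */
	COMPOSE_TO_BUFFER,	/* transparent fallback, adds latency */
};

/* Hypothetical cost model: each plane costs one unit, and needing
 * the scaler doubles it. */
static int config_cost(const struct plane_cfg *planes, int n)
{
	int cost = 0;

	for (int i = 0; i < n; i++)
		cost += planes[i].scaled ? 2 : 1;
	return cost;
}

/* Pick a mode against a made-up budget of 8 units of scanout-time
 * capacity; exceeding it triggers the buffer fallback, invisible to
 * user space except for the extra latency. */
static enum compose_mode pick_mode(const struct plane_cfg *planes, int n)
{
	return config_cost(planes, n) <= 8 ? COMPOSE_ON_THE_FLY
					   : COMPOSE_TO_BUFFER;
}
```

The point of the sketch is only that the mode decision is private to
the driver - user space submits the same plane configuration either
way.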
> What problems do you see with this plan?
> Are any of those problems unfixable or simply prohibitive?
>
> I have some concerns, which I am not sure will actually be a
> problem:
> - Does allocating 100 planes eat too much kernel memory?
>   I mean just the bookkeeping, properties, etc.
> - Would such an amount of planes make some in-kernel algorithms
>   slow (particularly in DRM common code)?
> - Considering how user space discovers all DRM resources, would
>   this make a compositor "slow" to start?

I don't see any problem with that. We have a few plane loops, but
iirc those can easily be fixed to use indices and similar tricks.
The atomic ioctl itself should scale nicely.

> I suppose whether these turn out to be prohibitive or not, one just
> has to implement it and see. It should be usable on a slowish CPU
> with unimpressive amounts of RAM, because that is where a separate
> 2D compositing engine gives the most kick.
>
> FWIW, dynamically created/destroyed planes would probably not be
> the answer. The kernel driver cannot decide beforehand how many
> planes it can expose. How many planes can be used depends
> completely on how user space decides to use them. Therefore I
> believe it should always expose the maximum number, whether or not
> there is any real use case that could actually get them all
> running.

Yeah, dynamic planes don't sound like a nice solution, not least
because you'd get to audit piles of code. Currently only
framebuffers (and to some extent connectors) can really come and go
freely in kms-land.

> What if I cannot even pick a maximum number of planes, but wanted
> to (as the hardware allows) let the 2D compositing scale up
> basically unlimited while becoming just slower and slower?
>
> I think at that point one would be looking at a rendering API
> really, rather than a KMS API, so it's probably out of scope.
> Where is the line between KMS 2D compositing with planes vs. 2D
> composite rendering?
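The "check, then reduce complexity" flow discussed above is exactly
what a test-only atomic commit enables: user space proposes a full
configuration, the kernel's atomic check accepts or rejects it
without touching the hardware, and the compositor retries with less.
A self-contained sketch of that negotiation loop - atomic_test() is
a stub standing in for a real test-only atomic commit (the
DRM_MODE_ATOMIC_TEST_ONLY flag), and its 6-plane limit is invented
so the loop has something to converge on:

```c
#include <assert.h>
#include <stdbool.h>

/* Stub for the kernel's atomic check; a real compositor would build
 * the property set and submit it with the test-only flag instead. */
static bool atomic_test(int nplanes)
{
	return nplanes <= 6;	/* made-up hardware limit */
}

/* Drop overlay planes one by one (falling back to compositing their
 * contents in GL, say) until the kernel accepts the configuration.
 * Returns the accepted plane count, or 0 if even a single plane is
 * rejected. */
static int negotiate_planes(int wanted)
{
	for (int n = wanted; n >= 1; n--)
		if (atomic_test(n))
			return n;
	return 0;
}
```

Because the check is a no-op on the hardware, this loop stays cheap
even when the driver exposes far more planes than any one frame can
actually use.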
I think KMS should still be real-time compositing - if you have to
internally render to a buffer and then scan that one out due to lack
of memory bandwidth or so, that very much sounds like a rendering
API. Of course stuff like writeback buffers blurs that line a bit,
but hw writeback is still real-time.

> Should I really be designing a driver-specific compositing API
> instead, similar to what the Mesa OpenGL implementations use? Then
> have user space maybe use the user space driver part via OpenWFC
> perhaps?
>
> And when I mention OpenWFC, you probably notice that I am not
> aware of any standard user space API I could be implementing
> here. ;-)

Personally I'd expose a bunch of planes with KMS (enough so that you
can reap the usual benefits planes bring wrt video playback and
stuff like that). So perhaps something in line with what current hw
does in hw, and then double it - 16 planes or so. Your driver would
reject any requests that need intermediate buffers to store render
results, i.e. everything that can't be scanned out directly in
real-time at about 60fps.

The fun with KMS planes is also that right now we have zero
standards for z-ordering and blending, so we'd need to define that
first.

Then expose everything else with a separate API. I guess you'll just
end up with per-compositor userspace drivers due to the lack of a
widespread 2D API. OpenVG is kinda dead, and cairo might not fit.

-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel