Hi,

there is some hardware that can do 2D compositing with an arbitrary number of planes. I'm not sure what the absolute maximum number of planes is, but for this discussion let's say it is 100. There are many complicated, dynamic constraints on how many planes can be used at once, at what sizes, and so on; a driver would be able to check those constraints before kicking the 2D compositing engine.

In the best case (only a few planes in use), the 2D compositing engine can composite on the fly during scanout, just like the usual overlay hardware blocks in CRTCs. When the composition complexity goes up, the driver can fall back to compositing into an intermediate buffer instead of compositing on the fly. This fallback needs to be completely transparent to user space, implying nothing more than additional latency.

These 2D compositing features should be exposed to user space through a standard kernel ABI, hopefully one that exists in the very near future, like KMS atomic.

Assuming the DRM universal planes and atomic mode setting / page flip infrastructure is in place, could the 2D compositing capabilities be exposed through universal planes? We can assume that plane properties are enough to describe all the compositing parameters.

Atomic updates are needed so that the complicated constraints can be checked as a whole, and so that user space can try to reduce the composition complexity if the kernel driver sees that a configuration won't work (sketched below).

Would it be feasible to expose a hundred identical non-primary planes to user space via DRM (see the first sketch below)? If that could be done, the kernel driver could just use the existing kernel/user ABIs without having to invent something new, and programs like a Wayland compositor would not need to be coded specifically for this hardware.

What problems do you see with this plan? Are any of them unfixable, or simply prohibitive? I have some concerns which I am not sure will actually be problems:

- Does allocating 100 planes eat too much kernel memory? I mean just the bookkeeping: properties and so on.
- Would such a number of planes make some in-kernel algorithms slow, particularly in DRM common code?
- Considering how user space discovers all DRM resources, would this make a compositor "slow" to start (see the last sketch below)?

I suppose that whether these turn out to be prohibitive or not, one just has to implement it and see. It should be usable on a slowish CPU with unimpressive amounts of RAM, because that is where a separate 2D compositing engine gives the most kick.

FWIW, dynamically created/destroyed planes would probably not be the answer. The kernel driver cannot decide beforehand how many planes it can expose: how many planes can actually be used depends completely on how user space decides to use them. Therefore I believe the driver should always expose the maximum number, whether or not there is any real use case that could get them all running at once.

What if I cannot even pick a maximum number of planes, but wanted to let the 2D compositing scale up essentially without limit (as the hardware allows), just becoming slower and slower? I think at that point one would really be looking at a rendering API rather than a KMS API, so it is probably out of scope. Where is the line between KMS 2D compositing with planes and 2D composite rendering?

Or should I really be designing a driver-specific compositing API instead, similar to what the Mesa OpenGL implementations use, and then have user space drive it via, say, OpenWFC? And when I mention OpenWFC, you probably notice that I am not aware of any standard user space API I could be implementing here. ;-)
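To make the questions concrete, here is a rough sketch of the driver side I have in mind. drm_universal_plane_init() is the real entry point from the universal planes work (its exact signature may still change between kernel versions); everything prefixed with my_ is hypothetical:

#include <drm/drmP.h>
#include <drm/drm_crtc.h>
#include <uapi/drm/drm_fourcc.h>

#define MY_MAX_PLANES 100

/* Update/disable/destroy hooks elided from this sketch. */
static const struct drm_plane_funcs my_plane_funcs;

static const uint32_t my_formats[] = {
	DRM_FORMAT_XRGB8888,
	DRM_FORMAT_ARGB8888,
};

/* Hypothetical per-device structure. */
struct my_device {
	struct drm_device *drm;
	struct drm_plane planes[MY_MAX_PLANES];
};

static int my_create_compositing_planes(struct my_device *mdev)
{
	int i, ret;

	for (i = 0; i < MY_MAX_PLANES; i++) {
		/* Register an overlay (non-primary, non-cursor) plane;
		 * all of them look identical to user space. */
		ret = drm_universal_plane_init(mdev->drm, &mdev->planes[i],
					       0x1 /* possible_crtcs */,
					       &my_plane_funcs,
					       my_formats,
					       ARRAY_SIZE(my_formats),
					       DRM_PLANE_TYPE_OVERLAY);
		if (ret)
			return ret;

		/* Attach the compositing parameters (position, scaling,
		 * blending, ...) as plane properties here. */
	}

	return 0;
}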
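On the user space side, the back-off loop would look roughly like this. This assumes the atomic ioctl ends up with a test-only flag; the drmModeAtomic*() names follow the proposed libdrm API, and build_plane_request() / composite_remainder_in_gl() are hypothetical compositor helpers:

#include <xf86drm.h>
#include <xf86drmMode.h>

/* Hypothetical: fill req with plane -> framebuffer assignments for the
 * first n surfaces. */
void build_plane_request(drmModeAtomicReq *req, int n);
/* Hypothetical: composite the surfaces that did not get a plane,
 * e.g. with GL. */
void composite_remainder_in_gl(int first_unassigned);

/* Returns the number of planes the driver accepted. */
int assign_planes(int fd, int want)
{
	int n;

	for (n = want; n > 0; n--) {
		drmModeAtomicReq *req = drmModeAtomicAlloc();
		int ret;

		build_plane_request(req, n);
		/* Ask the driver to validate the whole configuration
		 * without touching the hardware. */
		ret = drmModeAtomicCommit(fd, req,
					  DRM_MODE_ATOMIC_TEST_ONLY, NULL);
		if (ret == 0) {
			/* Constraints are satisfied; commit for real. */
			drmModeAtomicCommit(fd, req,
					    DRM_MODE_PAGE_FLIP_EVENT, NULL);
			drmModeAtomicFree(req);
			composite_remainder_in_gl(n);
			return n;
		}

		/* Too complex for the hardware; retry with fewer planes. */
		drmModeAtomicFree(req);
	}

	composite_remainder_in_gl(0);
	return 0;
}

The point is that a test-only commit lets the compositor probe the dynamic constraints every frame without side effects, instead of the kernel having to promise a fixed number of usable planes up front.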
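And for the startup concern, this is the discovery path whose cost worries me: one GETPLANE ioctl per plane, plus more for each plane's properties, a hundred times over, every time a compositor starts. These are the standard libdrm calls (plus the universal planes client cap):

#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static void enumerate_planes(int fd)
{
	drmModePlaneRes *res;
	uint32_t i;

	/* Without this cap only the traditional overlays are listed. */
	drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);

	res = drmModeGetPlaneResources(fd);
	if (!res)
		return;

	for (i = 0; i < res->count_planes; i++) {
		/* One ioctl per plane... */
		drmModePlane *plane = drmModeGetPlane(fd, res->planes[i]);
		/* ...plus more to fetch each plane's properties. */
		drmModeObjectProperties *props =
			drmModeObjectGetProperties(fd, res->planes[i],
						   DRM_MODE_OBJECT_PLANE);

		/* Inspect plane->formats and props here. */

		drmModeFreeObjectProperties(props);
		drmModeFreePlane(plane);
	}

	drmModeFreePlaneResources(res);
}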
Thanks,
pq