On Wednesday, November 29th, 2023 at 13:45, Maxime Ripard <mripard@xxxxxxxxxx> wrote:

> Hi,
>
> Thanks for writing this down
>
> On Thu, Nov 16, 2023 at 03:53:20PM +0000, Simon Ser wrote:
> > On Thursday, November 9th, 2023 at 08:45, Simon Ser <contact@xxxxxxxxxxx> wrote:
> >
> > > User-space sometimes needs to allocate scanout-capable memory for
> > > GPU rendering purposes. On a vc4/v3d split render/display SoC, this
> > > is achieved via DRM dumb buffers: the v3d user-space driver opens
> > > the primary vc4 node, allocates a DRM dumb buffer there, exports it
> > > as a DMA-BUF, imports it into the v3d render node, and renders to it.
> > >
> > > However, DRM dumb buffers are only meant for CPU rendering, they are
> > > not intended to be used for GPU rendering. Primary nodes should only
> > > be used for mode-setting purposes, other programs should not attempt
> > > to open it. Moreover, opening the primary node is already broken on
> > > some setups: systemd grants permission to open primary nodes to
> > > physically logged in users, but this breaks when the user is not
> > > physically logged in (e.g. headless setup) and when the distribution
> > > is using a different init (e.g. Alpine Linux uses openrc).
> > >
> > > We need an alternate way for v3d to allocate scanout-capable memory.
> > > Leverage DMA heaps for this purpose: expose a CMA heap to user-space.
> >
> > So we've discussed this patch on IRC [1] [2]. Some random notes:
> >
> > - We shouldn't create per-DRM-device heaps in general. Instead, we should try
> >   using centralized heaps like the existing system and cma ones. That way other
> >   drivers (video, render, etc) can also link to these heaps without depending
> >   on the display driver.
> > - We can't generically link to heaps in core DRM, however we could probably
> >   provide a default for shmem and cma helpers.
> > - We're missing a bunch of heaps, e.g. sometimes there are multiple cma areas
> >   but only a single cma heap is created right now.
> > - Some hw needs the memory to be in a specific region for scanout (e.g. lower
> >   256MB of RAM for Allwinner). We could create one heap per such region (but is
> >   it fine to have overlapping heaps?).
>
> Just for reference, it's not the scanout itself that has that
> requirement on Allwinner SoCs, it's the HW codec. But if you want to
> display the decoded frame directly using dma-buf, you'll still need to
> either allocate a scanout buffer and hope it'll be in the lower 256MB,
> or allocate a buffer from the codec in the lower 256MB and then hope
> it's scanout-capable (which it is, so that's what we do, but there's no
> guarantee about it).

OK. Yeah, the problem remains.

> I think the logicvc is a much better example for this, since it requires
> framebuffers to be in a specific area, with each plane having a
> dedicated area.
>
> AFAIK that's the most extreme example we have upstream.

That kind of restriction is not supported by generic user-space. As far
as user-space is concerned, scanout-capable buffers aren't tied to any
plane in particular. Generic user-space allocates via GBM or dumb
buffers, and at allocation time there is no hint about the plane the
buffer will be attached to.

I'm personally not super excited about supporting this kind of weird
setup, which doesn't match the KMS uAPI.