On Mon, Feb 17, 2025 at 12:08:08PM +0200, Pekka Paalanen wrote:
> Hi Arun,
>
> this whole series seems to be missing all the UAPI docs for the DRM
> ReST files, e.g. drm-kms.rst. The UAPI header doc comments are not a
> replacement for them, I would assume both are a requirement.
>
> Without the ReST docs it is really difficult to see how this new UAPI
> should be used.

Seconded. But really I only wanted to comment on the userspace address
in drm blobs.

> > +/**
> > + * struct drm_histogram_config
> > + *
> > + * @hist_mode_data: address to the histogram mode specific data if any
>
> Do I understand correctly that the KMS blob will contain a userspace
> virtual memory address (a user pointer)? How does that work? What are
> the lifetime requirements for that memory?
>
> I do not remember any precedent of this, and I suspect it's not a good
> design. I believe all the data should be contained in the blobs, e.g.
> how IN_FORMATS does it. I'm not sure what would be the best UAPI here
> for returning histogram data to userspace, but at least all the data
> sent to the kernel should be contained in the blob itself since it
> seems to be quite small. Variable length is ok for blobs.

So yeah, this doesn't work, for a few reasons:

- What you're allowed to do during an atomic kms commit is very
  restricted, and a userspace page fault due to copy_from/to_user is
  definitely not ok. Which means you need to unconditionally copy
  before the atomic commit in the synchronous prep phase for the
  user->kernel direction, and somewhere after the entire thing has
  finished for the other direction. So this is worse than just more
  blobs, because with drm blobs you can at least avoid copying if
  nothing has changed.

- Due to the above you also cannot synchronize with userspace for the
  kernel->userspace copy. And you can't fix that with a sync_file out
  fence, because the underlying dma_fence rules are what prevents you
  from doing userspace page faults in an atomic commit, and the same
  rules apply to any other sync_file fence too.

- More fundamentally, both drm blobs and userspace virtual address
  spaces (as represented by struct mm_struct) are refcounted objects,
  with entirely decoupled lifetimes. You'll have UAF issues here, and
  if you fix them by grabbing references you'll break the world.

tl;dr: this does not work

Alternative A: drm blob
-----------------------

This would work for the userspace->kernel direction, but there are some
downsides:

- You still copy, although less often than with a userspace pointer.

- The kernel->userspace direction doesn't work, because blob objects
  are immutable. We have mutable blob properties, but mutability is
  achieved by exchanging the entire blob object.

There are two options to address that:

a) Fundamentally immutable objects are a really nice api design, so I'd
   prefer not to change that. But in theory making blob objects mutable
   would work, and probably break the world.

b) A more benign trick would be to split the blob object id allocation
   from creating the object itself. We could then allocate and return
   the blob ID of the new histogram to userspace synchronously from the
   atomic ioctl, while creating the object for real only in the atomic
   commit. As long as we preallocate any memory this doesn't break any
   dma_fence signalling rules. Which also means we could use the
   existing atomic out-fence (or a new one for histograms) to signal to
   userspace when the data is ready, so this is at least somewhat
   useful for compositors without fundamental issues. Rough flow
   sketched below.

You still suffer from additional copies here.
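To illustrate option b), here's a rough sketch of what the compositor
side could look like. Only the libdrm calls and the standard
OUT_FENCE_PTR property exist today; the per-crtc histogram property and
the blob-id-returned-synchronously part are made up for illustration:

#include <poll.h>
#include <stdint.h>
#include <unistd.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

/* hist_blob_id is what the (hypothetical) atomic ioctl extension
 * handed back synchronously when the commit was submitted. */
static int read_histogram(int fd, uint32_t crtc_id,
			  uint32_t out_fence_prop, uint32_t hist_blob_id)
{
	drmModeAtomicReq *req = drmModeAtomicAlloc();
	int32_t out_fence = -1;
	int ret;

	/* Request an out-fence: it signals once the commit, and with it
	 * the histogram write-back, has finished. A real commit would
	 * also set whatever property actually enables the histogram. */
	drmModeAtomicAddProperty(req, crtc_id, out_fence_prop,
				 (uint64_t)(uintptr_t)&out_fence);
	ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_NONBLOCK, NULL);
	drmModeAtomicFree(req);
	if (ret)
		return ret;

	/* Wait for the data, then read the now-created blob. */
	struct pollfd pfd = { .fd = out_fence, .events = POLLIN };
	poll(&pfd, 1, -1);
	close(out_fence);

	drmModePropertyBlobRes *blob = drmModeGetPropertyBlob(fd, hist_blob_id);
	if (!blob)
		return -1;
	/* ... consume blob->data, blob->length ... */
	drmModeFreePropertyBlob(blob);
	return 0;
}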
Alternative B: gem_bo
---------------------

One alternative which naturally has mutable data would be gem_bo, maybe
wrapped in a drm_fb. The issue with that is that for small histograms
you really want cpu access both in userspace and in the kernel, while
most display hardware wants uncached memory. And all the display-only
kms drivers we have do not have a concept of cached gem_bo, unlike many
of the drm drivers with render/accel support. Which means we'd be
adding gem_bo that cannot be used for display to display-only drivers,
and I'd expect this to result in compositors blowing up in funny ways
to no end. So not a good idea either, at least not if your histograms
are small and the display hw doesn't dma them in/out already anyway.

This also means that we'll probably need 2 interfaces here, one
supporting gem_bo for big histograms and hw that can dma in/out of
them, and a 2nd one optimized for the cpu access case.

Alternative C: memfd
--------------------

I think a new drm property type that accepts memfd would fit the bill
quite well:

- memfd can be mmap()'d, so you avoid copies.

- they're distinct from gem_bo, so no chaos in apis everywhere with
  imposter gem_bo that cannot ever be used for display.

- memfd can be sealed, so we can validate that they have the right
  size.

- thanks to udmabuf there's already core mm code to properly pin them,
  so it's not painful to implement all this.

For a driver interface I think the memfd should be pinned as long as
it's in a drm_crtc/plane/whatever_state structure, with a kernel vmap
void * pointer already set up. That way drivers can't get this wrong.

The uapi has a few options:

- Allow memfd to back a drm_framebuffer. This won't result in api chaos
  since the compositor creates these, and these memfd should never show
  up in any property that would have a real fb backed by gem_bo. This
  still feels horrible to me personally, but it would allow supporting
  histograms that need gem_bo in the same api. Personally I think we
  should just do two flavors, they're too distinct.

- A new memfd kms object like blob objects, which you can create and
  destroy and which are refcounted. Creation would also pin the memfd
  and check it has a sealed size (and whatever else we want sealed).
  This avoids pin/unpin every time you change the memfd property, but
  no idea whether that's a real use-case.

- memfd properties just get the file descriptor (like in/out fences do)
  and the drm atomic ioctl layer transparently pins/unpins as needed.
  A rough sketch of the compositor side is at the end of this mail.

Personally I think alternative C is neat, A is doable, and B is really
only for hw that can dma histograms in/out and where they're big enough
that doing so is a functional requirement.

Cheers, Sima
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
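Sketch for the memfd flow mentioned above, assuming a hypothetical
per-crtc "HISTOGRAM_FD" property that takes the fd like IN_FENCE_FD
does for fences. memfd_create() and the sealing api are existing uapi,
everything else is made up:

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#define HIST_SIZE 4096	/* made-up histogram buffer size */

/* Create the buffer the kernel would pin and vmap: returns the mapping
 * the compositor reads the histogram from, the fd goes into the
 * (hypothetical) HISTOGRAM_FD property. */
static void *create_histogram_buf(int *fd_out)
{
	void *map;
	int fd;

	fd = memfd_create("kms-histogram",
			  MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return NULL;
	if (ftruncate(fd, HIST_SIZE) < 0)
		goto err;

	/* Seal the size so the kernel can validate it once at pin time. */
	if (fcntl(fd, F_ADD_SEALS,
		  F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_SEAL) < 0)
		goto err;

	/* Same pages the kernel vmap()s, so no copies on either side. */
	map = mmap(NULL, HIST_SIZE, PROT_READ | PROT_WRITE,
		   MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto err;

	*fd_out = fd;
	return map;

err:
	close(fd);
	return NULL;
}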