Re: [PATCH v8 01/14] drm: Define histogram structures exposed to user

On 17-02-2025 22:56, Simona Vetter wrote:
On Mon, Feb 17, 2025 at 12:08:08PM +0200, Pekka Paalanen wrote:
Hi Arun,

this whole series seems to be missing all the UAPI docs for the DRM
ReST files, e.g. drm-kms.rst. The UAPI header doc comments are not a
replacement for them, I would assume both are a requirement.

Without the ReST docs it is really difficult to see how this new UAPI
should be used.
Seconded. But really only wanted to comment on the userspace address in
drm blobs.

+/**
+ * struct drm_histogram_config
+ *
+ * @hist_mode_data: address to the histogram mode specific data if any
Do I understand correctly that the KMS blob will contain a userspace
virtual memory address (a user pointer)? How does that work? What are
the lifetime requirements for that memory?

I do not remember any precedent of this, and I suspect it's not a good
design. I believe all the data should be contained in the blobs, e.g.
how IN_FORMATS does it. I'm not sure what would be the best UAPI here
for returning histogram data to userspace, but at least all the data
sent to the kernel should be contained in the blob itself since it
seems to be quite small. Variable length is ok for blobs.
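
For illustration, a fully self-contained, variable-length blob in the spirit
of IN_FORMATS could look roughly like this (the struct and field names here
are hypothetical, not the proposed uapi):

	/*
	 * Hypothetical sketch of a self-contained histogram config blob:
	 * everything the kernel needs lives inside the blob itself, no
	 * user pointers, variable length via a trailing array.
	 * uapi style types, i.e. <linux/types.h>.
	 */
	struct histogram_config_blob {
		__u32 mode;	/* histogram mode, e.g. per-channel vs luma */
		__u32 num_bins;	/* number of entries in data[] */
		__u64 flags;
		__u64 data[];	/* mode specific payload, num_bins entries */
	};
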
So yeah the user pointer in the blob doesn't work for a few reasons:

- It's very restrictive what you're allowed to do during an atomic kms
   commit, and a userspace page fault due to copy_from/to_user is
   definitely not ok. Which means you need to unconditionally copy before
   the atomic commit in the synchronous prep phase for the user->kernel
   direction, and somewhere after the entire thing has finished for the
   other direction. So this is worse than just more blobs, because with
   drm blobs you can at least avoid copying if nothing has changed.

- Due to the above you also cannot synchronize with userspace for the
   kernel->userspace copy. And you can't fix that with a sync_file out
   fence, because the underlying dma_fence rules are what prevents you from
   doing userspace page faults in atomic commit, and the same rules apply
   for any other sync_file fence too.

- More fundamentally, both drm blobs and userspace virtual address spaces
   (as represented by struct mm_struct) are refcounted objects, with
   entirely decoupled lifetimes. You'll have UAF issues here, and if you
   fix them by grabbing references you'll break the world.

tldr; this does not work
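
To make the first point concrete, a rough sketch (the member and function
names below are made up, only the prep/commit split matters):

	/*
	 * Any copy_from_user() must happen in the synchronous prep/check
	 * phase; the commit tail runs under dma_fence signalling rules
	 * and must not fault on userspace memory.
	 */
	static int prep_histogram(struct drm_crtc_state *state,
				  const void __user *uptr, size_t size)
	{
		/* may fault: still in the ioctl / atomic check path */
		/* state->histogram_cfg is a hypothetical, pre-allocated buffer */
		if (copy_from_user(state->histogram_cfg, uptr, size))
			return -EFAULT;
		return 0;
	}

	static void commit_histogram(struct drm_crtc_state *state)
	{
		/*
		 * must not fault, allocate or wait on userspace: only use
		 * data copied and validated during the prep phase
		 */
		/* program_hw(state->histogram_cfg); */
	}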

Alternative A: drm blob
-----------------------

This would work for the userspace->kernel direction, but there's some
downsides:

- You still copy, although less often than with a userspace pointer.

- The kernel->userspace direction doesn't work, because blob objects are
   immutable. We have mutable blob properties, but mutability is achieved
   by exchanging the entire blob object. There's two options to address
   that:

   a) Fundamentally immutable objects are really nice api design, so I
      prefer not to change that. In theory making blob objects mutable
      would work, but it would probably break the world.

   b) A more benign trick would be to split the blob object id allocation
      from creating the object itself. We could then allocate and return
      the blob ID of the new histogram to userspace synchronously from the
      atomic ioctl, while creating the object for real only in the atomic
      commit.

      As long as we preallocate any memory this doesn't break any dma_fence
      signalling rules. Which also means we could use the existing atomic
      out-fence (or a new one for histograms) to signal to userspace when
      the data is ready, so this is at least somewhat useful for
      compositors without fundamental issues.

      You still suffer from additional copies here.
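
      For the compositor side of option b), reading the result back once the
      out-fence has signalled could look roughly like this (the "HISTOGRAM"
      property and the helper are made up, the libdrm calls are real):

	#include <xf86drm.h>
	#include <xf86drmMode.h>

	/*
	 * Sketch: fetch the freshly exchanged blob behind a hypothetical
	 * crtc "HISTOGRAM" property after the out-fence has signalled.
	 */
	static drmModePropertyBlobPtr read_histogram(int fd, uint32_t crtc_id,
						     uint32_t hist_prop_id)
	{
		drmModeObjectPropertiesPtr props =
			drmModeObjectGetProperties(fd, crtc_id,
						   DRM_MODE_OBJECT_CRTC);
		drmModePropertyBlobPtr blob = NULL;

		if (!props)
			return NULL;

		for (uint32_t i = 0; i < props->count_props; i++) {
			if (props->props[i] == hist_prop_id) {
				/* prop_values[i] holds the current blob id */
				blob = drmModeGetPropertyBlob(fd,
							      props->prop_values[i]);
				break;
			}
		}
		drmModeFreeObjectProperties(props);
		return blob;	/* free with drmModeFreePropertyBlob() */
	}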

Alternative B: gem_bo
---------------------

One alternative which naturally has mutable data would be gem_bo, maybe
wrapped in a drm_fb. The issue with that is that for small histograms you
really want cpu access both in userspace and the kernel, while most
display hardware wants uncached. And all the display-only kms drivers we
have do not have a concept of cached gem_bo, unlike many of the drm
drivers with render/accel support. Which means we're adding gem_bo which
cannot be used for display, on display-only drivers, and I'd expect this
will result in compositors blowing up in funny ways to no end.

So not a good idea either, at least not if your histograms are small and
the display hw doesn't dma them in/out already anyway.

This also means that we'll probably need 2 interfaces here, one supporting
gem_bo for big histograms and hw that can dma in/out of them, and a 2nd
one optimized for the cpu access case.

Alternative C: memfd
--------------------

I think a new drm property type that accepts memfd would fit the bill
quite well:

- memfd can be mmap()ed, so you avoid copies.

- they're distinct from gem_bo, so no chaos in apis everywhere with imposter
   gem_bo that cannot ever be used for display.

- memfd can be sealed, so we can validate that they have the right size
   (see the sketch after this list)

- thanks to udmabuf there's already core mm code to properly pin them, so
   it shouldn't be painful to implement this all.
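
As a sketch of the userspace side (the size and the exact seal set here are
just an example):

	#define _GNU_SOURCE
	#include <sys/mman.h>
	#include <fcntl.h>
	#include <unistd.h>

	#define HIST_SIZE 4096	/* hypothetical histogram buffer size */

	/* create a fixed-size, sealed memfd suitable for handing to KMS */
	static int create_histogram_memfd(void)
	{
		int fd = memfd_create("histogram",
				      MFD_CLOEXEC | MFD_ALLOW_SEALING);
		if (fd < 0)
			return -1;

		if (ftruncate(fd, HIST_SIZE) < 0 ||
		    /* freeze the size so the kernel can rely on it */
		    fcntl(fd, F_ADD_SEALS,
			  F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_SEAL) < 0) {
			close(fd);
			return -1;
		}
		return fd;
	}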

For a driver interface I think the memfd should be pinned as long as it's
in a drm_crtc/plane/whatever_state structure, with a kernel vmap void *
pointer already set up. That way drivers can't get this wrong.
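
In code that could amount to something like the following (entirely
hypothetical, not an existing kernel structure):

	/*
	 * Sketch only: per-state bookkeeping drm core could carry so that
	 * drivers only ever see a pinned memfd with a ready-made mapping.
	 */
	struct drm_memfd_attachment {
		struct file *memfd;	/* pinned for as long as the state lives */
		void *vaddr;		/* kernel vmap of the sealed buffer */
		size_t size;		/* sealed size, validated at attach time */
	};
	/* embedded in drm_crtc_state / drm_plane_state as appropriate */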

The uapi has a few options:

- Allow memfd to back drm_framebuffer. This won't result in api chaos
   since the compositor creates these, and these memfd should never show up
   in any property that would have a real fb backed by gem_bo. This still
   feels horrible to me personally, but it would allow supporting
   histograms that need gem_bo in the same api. Personally I think we
   should just do two flavors, they're too distinct.

- A new memfd kms object like blob objects, which you can create and
   destroy and which are refcounted. Creation would also pin the memfd and
   check it has a sealed size (and whatever else we want sealed). This
   avoids pin/unpin every time you change the memfd property, but no idea
   whether that's a real use-case.

- memfd properties just get the file descriptor (like in/out fences do)
   and the drm atomic ioctl layer transparently pins/unpins as needed (see
   the sketch right after this list).
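
For that last flavor the userspace side would look much like passing an
IN_FENCE_FD today (the property id and helper are made up, the libdrm calls
are real):

	#include <xf86drm.h>
	#include <xf86drmMode.h>

	/*
	 * Sketch: hand the sealed memfd to the kernel as a property value
	 * in an atomic commit; the kernel would pin it for the commit's
	 * lifetime.
	 */
	static int commit_histogram_fd(int drm_fd, uint32_t crtc_id,
				       uint32_t histogram_prop_id, int memfd)
	{
		drmModeAtomicReqPtr req = drmModeAtomicAlloc();
		int ret;

		if (!req)
			return -1;

		drmModeAtomicAddProperty(req, crtc_id, histogram_prop_id,
					 (uint64_t)memfd);
		ret = drmModeAtomicCommit(drm_fd, req,
					  DRM_MODE_ATOMIC_NONBLOCK, NULL);
		drmModeAtomicFree(req);
		return ret;
	}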

Personally I think option C is neat, A doable, B really only for hw that
can dma in/out of histograms and where it's big enough that doing so is a
functional requirement.

Cheers, Sima
Thanks for the detailed exploration of the available options and the conclusion drawn from them. Bringing in memfd as a drm object opens up new opportunities for drm users and is a very good thought. Just curious: will histogram be the only user for this, or does the new IPC open up thoughts for other interfaces such as writeback etc.?

I also personally feel bringing memfd to drm is a good approach, and will try to explore the design part.
Any other comments/opinions on this from anyone?

Thanks and Regards,
Arun R Murthy