Re: [PATCH] drm/gem: Expose the buffer object handle to userspace last

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Mon, 20 Feb 2023 10:23:19 +0000

On 20/02/2023 10:01, Christian König wrote:
Am 20.02.23 um 10:55 schrieb Tvrtko Ursulin:

Hi,

On 14/02/2023 13:59, Christian König wrote:
Am 14.02.23 um 13:50 schrieb Tvrtko Ursulin:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Currently drm_gem_handle_create_tail exposes the handle to userspace
before the buffer object constructions is complete. This allowing
of working against a partially constructed object, which may also be in
the process of having its creation fail, can have a range of negative
outcomes.

A lot of those will depend on what the individual drivers are doing in
their obj->funcs->open() callbacks, and also with a common failure mode
being -ENOMEM from drm_vma_node_allow.

We can make sure none of this can happen by allocating a handle last,
although with a downside that more of the function now runs under the
dev->object_name_lock.

Looking into the individual drivers open() hooks, we have
amdgpu_gem_object_open which seems like it could have a potential 
security
issue without this change.

A couple drivers like qxl_gem_object_open and vmw_gem_object_open
implement no-op hooks so no impact for them.

A bunch of other require a deeper look by individual owners to asses 
for
impact. Those are lima_gem_object_open, nouveau_gem_object_open,
panfrost_gem_open, radeon_gem_object_open and 
virtio_gpu_gem_object_open.

Putting aside the risk assesment of the above, some common scenarios to
think about are along these lines:

1)
Userspace closes a handle by speculatively "guessing" it from a second
thread.

This results in an unreachable buffer object so, a memory leak.

2)
Same as 1), but object is in the process of getting closed (failed
creation).

The second thread is then able to re-cycle the handle and idr_remove 
would
in the first thread would then remove the handle it does not own 
from the
idr.

3)
Going back to the earlier per driver problem space - individual impact
assesment of allowing a second thread to access and operate on a 
partially
constructed handle / object. (Can something crash? Leak information?)

In terms of identifying when the problem started I will tag some 
patches
as references, but not all, if even any, of them actually point to a
broken state. I am just identifying points at which more opportunity 
for
issues to arise was added.

Yes I've looked into this once as well, but couldn't completely solve 
it for some reason.

Give me a day or two to get this tested and all the logic swapped 
back into my head again.

Managed to recollect what the problem with earlier attempts was?

Nope, that's way to long ago. I can only assume that I ran into problems 
with the object_name_lock.

Probably best to double check if that doesn't result in a lock inversion 
when somebody grabs the reservation lock in their ->load() callback.

Hmm I don't immediately follow the connection. But I have only found 
radeon_driver_load_kms as using the load callback. Is there any lockdep 
enabled CI for that driver which could tell us if there is a problem there?

Regards,

Tvrtko


Regards,
Christian.


Regards,

Tvrtko

Christian.


References: 304eda32920b ("drm/gem: add hooks to notify driver when 
object handle is created/destroyed")
References: ca481c9b2a3a ("drm/gem: implement vma access management")
References: b39b5394fabc ("drm/gem: Add drm_gem_object_funcs")
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
Cc: Rob Clark <robdclark@xxxxxxxxxxxx>
Cc: Ben Skeggs <bskeggs@xxxxxxxxxx>
Cc: David Herrmann <dh.herrmann@xxxxxxxxx>
Cc: Noralf Trønnes <noralf@xxxxxxxxxxx>
Cc: David Airlie <airlied@xxxxxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: lima@xxxxxxxxxxxxxxxxxxxxx
Cc: nouveau@xxxxxxxxxxxxxxxxxxxxx
Cc: Steven Price <steven.price@xxxxxxx>
Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
Cc: spice-devel@xxxxxxxxxxxxxxxxxxxxx
Cc: Zack Rusin <zackr@xxxxxxxxxx>
---
  drivers/gpu/drm/drm_gem.c | 48 
+++++++++++++++++++--------------------
  1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index aa15c52ae182..e3d897bca0f2 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -356,52 +356,52 @@ drm_gem_handle_create_tail(struct drm_file 
*file_priv,
                 u32 *handlep)
  {
      struct drm_device *dev = obj->dev;
-    u32 handle;
      int ret;
WARN_ON(!mutex_is_locked(&dev->object_name_lock));
      if (obj->handle_count++ == 0)
          drm_gem_object_get(obj);
+    ret = drm_vma_node_allow(&obj->vma_node, file_priv);
+    if (ret)
+        goto err_put;
+
+    if (obj->funcs->open) {
+        ret = obj->funcs->open(obj, file_priv);
+        if (ret)
+            goto err_revoke;
+    }
+
      /*
-     * Get the user-visible handle using idr.  Preload and perform
-     * allocation under our spinlock.
+     * Get the user-visible handle using idr as the _last_ step.
+     * Preload and perform allocation under our spinlock.
       */
      idr_preload(GFP_KERNEL);
      spin_lock(&file_priv->table_lock);
-
      ret = idr_alloc(&file_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
-
      spin_unlock(&file_priv->table_lock);
      idr_preload_end();
-    mutex_unlock(&dev->object_name_lock);
      if (ret < 0)
-        goto err_unref;
-
-    handle = ret;
+        goto err_close;
-    ret = drm_vma_node_allow(&obj->vma_node, file_priv);
-    if (ret)
-        goto err_remove;
+    mutex_unlock(&dev->object_name_lock);
-    if (obj->funcs->open) {
-        ret = obj->funcs->open(obj, file_priv);
-        if (ret)
-            goto err_revoke;
-    }
+    *handlep = ret;
-    *handlep = handle;
      return 0;
+err_close:
+    if (obj->funcs->close)
+        obj->funcs->close(obj, file_priv);
  err_revoke:
      drm_vma_node_revoke(&obj->vma_node, file_priv);
-err_remove:
-    spin_lock(&file_priv->table_lock);
-    idr_remove(&file_priv->object_idr, handle);
-    spin_unlock(&file_priv->table_lock);
-err_unref:
-    drm_gem_object_handle_put_unlocked(obj);
+err_put:
+    if (--obj->handle_count == 0)
+        drm_gem_object_put(obj);
+
+    mutex_unlock(&dev->object_name_lock);
+
      return ret;
  }