Re: [PATCH 1/3] dma_resv: prime lockdep annotations

Thomas Hellström (VMware) <thomas_os@xxxxxxxxxxxx> · Thu, 22 Aug 2019 08:42:41 +0200

On 8/21/19 9:51 PM, Daniel Vetter wrote:
On Wed, Aug 21, 2019 at 08:27:59PM +0200, Thomas Hellström (VMware) wrote:
On 8/21/19 8:11 PM, Daniel Vetter wrote:
On Wed, Aug 21, 2019 at 7:06 PM Thomas Hellström (VMware)
<thomas_os@xxxxxxxxxxxx> wrote:
On 8/21/19 6:34 PM, Daniel Vetter wrote:
On Wed, Aug 21, 2019 at 05:54:27PM +0200, Thomas Hellström (VMware) wrote:
On 8/20/19 4:53 PM, Daniel Vetter wrote:
Full audit of everyone:

- i915, radeon, amdgpu should be clean per their maintainers.

- vram helpers should be fine, they don't do command submission, so
      really no business holding struct_mutex while doing copy_*_user. But
      I haven't checked them all.

- panfrost seems to dma_resv_lock only in panfrost_job_push, which
      looks clean.

- v3d holds dma_resv locks in the tail of its v3d_submit_cl_ioctl(),
      copying from/to userspace happens all in v3d_lookup_bos which is
      outside of the critical section.

- vmwgfx has a bunch of ioctls that do their own copy_*_user:
      - vmw_execbuf_process: First this does some copies in
        vmw_execbuf_cmdbuf() and also in the vmw_execbuf_process() itself.
        Then comes the usual ttm reserve/validate sequence, then actual
        submission/fencing, then unreserving, and finally some more
        copy_to_user in vmw_execbuf_copy_fence_user. Glossing over tons of
        details, but looks all safe.
      - vmw_fence_event_ioctl: No ttm_reserve/dma_resv_lock anywhere to be
        seen, seems to only create a fence and copy it out.
      - a pile of smaller ioctls in vmwgfx_ioctl.c, no reservations to be
        found there.
      Summary: vmwgfx seems to be fine too.

- virtio: There's virtio_gpu_execbuffer_ioctl, which does all the
      copying from userspace before even looking up objects through their
      handles, so safe. Plus the getparam/getcaps ioctl, also both safe.

- qxl only has qxl_execbuffer_ioctl, which calls into
      qxl_process_single_command. There's a lovely comment before the
      __copy_from_user_inatomic that the slowpath should be copied from
      i915, but I guess that never happened. Try not to be unlucky and get
      your CS data evicted between when it's written and the kernel tries
      to read it. The only other copy_from_user is for relocs, but those
      are done before qxl_release_reserve_list(), which seems to be the
      only thing reserving buffers (in the ttm/dma_resv sense) in that
      code. So looks safe.

- A debugfs file in nouveau_debugfs_pstate_set() and the usif ioctl in
      usif_ioctl() look safe. nouveau_gem_ioctl_pushbuf() otoh breaks this
      everywhere and needs to be fixed up.

Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: Christian König <christian.koenig@xxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Thomas Zimmermann <tzimmermann@xxxxxxx>
Cc: Rob Herring <robh@xxxxxxxxxx>
Cc: Tomeu Vizoso <tomeu.vizoso@xxxxxxxxxxxxx>
Cc: Eric Anholt <eric@xxxxxxxxxx>
Cc: Dave Airlie <airlied@xxxxxxxxxx>
Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx>
Cc: Ben Skeggs <bskeggs@xxxxxxxxxx>
Cc: "VMware Graphics" <linux-graphics-maintainer@xxxxxxxxxx>
Cc: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
---
     drivers/dma-buf/dma-resv.c | 12 ++++++++++++
     1 file changed, 12 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 42a8f3f11681..3edca10d3faf 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -34,6 +34,7 @@
     #include <linux/dma-resv.h>
     #include <linux/export.h>
+#include <linux/sched/mm.h>
     /**
      * DOC: Reservation Object Overview
@@ -107,6 +108,17 @@ void dma_resv_init(struct dma_resv *obj)
                      &reservation_seqcount_class);
      RCU_INIT_POINTER(obj->fence, NULL);
      RCU_INIT_POINTER(obj->fence_excl, NULL);
+
+   if (IS_ENABLED(CONFIG_LOCKDEP)) {
+           if (current->mm)
+                   down_read(&current->mm->mmap_sem);
+           ww_mutex_lock(&obj->lock, NULL);
+           fs_reclaim_acquire(GFP_KERNEL);
+           fs_reclaim_release(GFP_KERNEL);
+           ww_mutex_unlock(&obj->lock);
+           if (current->mm)
+                   up_read(&current->mm->mmap_sem);
+   }
     }
     EXPORT_SYMBOL(dma_resv_init);
I assume that if this could have been done easily and maintainably using
only lockdep annotations instead of actually acquiring the locks, that would
have been done?
There's might_lock(), plus a pile of macros, but they don't map obviously,
so there's a pretty good chance I'd accidentally end up with the wrong type
of annotation. Easier to just take the locks quickly, and stuff that all into
a lockdep-only section to avoid overhead.
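
To illustrate the priming idea discussed here, the following is a minimal
userspace sketch, not kernel code and not lockdep itself. It assumes a toy
validator that records "A was held while B was acquired" pairs and flags any
later acquisition that inverts a recorded pair. All names (toy_acquire,
toy_release, toy_prime, the lock_class enum) are hypothetical and exist only
for this example; the three classes merely mirror the locks taken in the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the locks taken in the patch. */
enum lock_class { MMAP_SEM, RESV_LOCK, FS_RECLAIM, NR_CLASSES };

#define MAX_HELD 8

/* before[a][b] is true once "a held while b was acquired" has been seen. */
static bool before[NR_CLASSES][NR_CLASSES];
static enum lock_class held[MAX_HELD];
static int nr_held;

/*
 * Record an acquisition of class c. Returns false if c is already held
 * or if acquiring c now would invert a previously recorded order.
 * Only direct pairs are tracked; real lockdep also follows chains.
 */
static bool toy_acquire(enum lock_class c)
{
	bool ok = true;

	assert(nr_held < MAX_HELD);
	for (int i = 0; i < nr_held; i++) {
		if (held[i] == c || before[c][held[i]])
			ok = false;              /* recursion or inversion */
		else
			before[held[i]][c] = true; /* learn the new order */
	}
	held[nr_held++] = c;
	return ok;
}

/* Release the most recently acquired class (strict LIFO in this toy). */
static void toy_release(void)
{
	assert(nr_held > 0);
	nr_held--;
}

/*
 * Prime the validator by taking the classes once in the intended order,
 * analogous to briefly taking the real locks in dma_resv_init().
 */
static bool toy_prime(void)
{
	bool ok = toy_acquire(MMAP_SEM);

	ok = toy_acquire(RESV_LOCK) && ok;
	ok = toy_acquire(FS_RECLAIM) && ok;
	toy_release();
	toy_release();
	toy_release();
	return ok;
}
```

In this toy model, once toy_prime() has run, calling toy_acquire(RESV_LOCK)
and then toy_acquire(MMAP_SEM) makes the second call return false, because
the primed order is mmap_sem before the reservation lock. Taking the locks
once is what teaches the validator the order; no separate annotation type
has to be picked.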

Otherwise LGTM.

Reviewed-by: Thomas Hellström <thellstrom@xxxxxxxxxx>

Will test this and let you know if it trips on vmwgfx, but it really
shouldn't.
Thanks, Daniel
One thing that strikes me is that this puts restrictions on where you
can actually initialize a dma_resv, even if locking orders are otherwise
obeyed. But that might not be a big problem.
Hm yeah ... the trouble is I need a non-kthread thread so that I have
a current->mm. Otherwise I'd have put it into some init section with a
temp dma_buf. And I kinda don't want to create a fake ->mm just for
lockdep priming. I don't expect this to be a real problem in practice,
since before you've called dma_resv_init the reservation lock doesn't
exist, so you can't hold it. And you've probably just allocated it, so
fs_reclaim is going to be fine. And if you allocate dma_resv objects
from your fault handlers I have questions anyway :-)
Come to think of it, I think vmwgfx sometimes creates bos with another bo's
reservation lock held. I guess that would trip both the mmap_sem check and
the ww_mutex check?
If you do that, yes we're busted. Do you do that?

Yes, we do, in a couple of places it seems, and it also appears that TTM
is doing it, according to Christian.


I guess this needs a new idea for where to put this ... while making sure
everyone gets it. So some evil trick like putting it in drm_open() won't
work, since I also want to make sure everyone else using dma-buf follows
these rules.

IMO it should be sufficient to establish this locking order once, but I 
guess dma-buf module init time won't work because we might not have an 
mm structure?

/Thomas


Ideas?
-Daniel

