Re: [PATCH 1/3] dma_resv: prime lockdep annotations

Daniel Vetter <daniel@xxxxxxxx> · Thu, 22 Aug 2019 08:47:55 +0200



On Thu, Aug 22, 2019 at 8:42 AM Thomas Hellström (VMware)
<thomas_os@xxxxxxxxxxxx> wrote:
>
> On 8/21/19 9:51 PM, Daniel Vetter wrote:
> > On Wed, Aug 21, 2019 at 08:27:59PM +0200, Thomas Hellström (VMware) wrote:
> >> On 8/21/19 8:11 PM, Daniel Vetter wrote:
> >>> On Wed, Aug 21, 2019 at 7:06 PM Thomas Hellström (VMware)
> >>> <thomas_os@xxxxxxxxxxxx> wrote:
> >>>> On 8/21/19 6:34 PM, Daniel Vetter wrote:
> >>>>> On Wed, Aug 21, 2019 at 05:54:27PM +0200, Thomas Hellström (VMware) wrote:
> >>>>>> On 8/20/19 4:53 PM, Daniel Vetter wrote:
> >>>>>>> Full audit of everyone:
> >>>>>>>
> >>>>>>> - i915, radeon, amdgpu should be clean per their maintainers.
> >>>>>>>
> >>>>>>> - vram helpers should be fine, they don't do command submission, so
> >>>>>>>       really no business holding struct_mutex while doing copy_*_user. But
> >>>>>>>       I haven't checked them all.
> >>>>>>>
> >>>>>>> - panfrost seems to dma_resv_lock only in panfrost_job_push, which
> >>>>>>>       looks clean.
> >>>>>>>
> >>>>>>> - v3d holds dma_resv locks in the tail of its v3d_submit_cl_ioctl(),
> >>>>>>>       copying from/to userspace happens all in v3d_lookup_bos which is
> >>>>>>>       outside of the critical section.
> >>>>>>>
> >>>>>>> - vmwgfx has a bunch of ioctls that do their own copy_*_user:
> >>>>>>>       - vmw_execbuf_process: First this does some copies in
> >>>>>>>         vmw_execbuf_cmdbuf() and also in the vmw_execbuf_process() itself.
> >>>>>>>         Then comes the usual ttm reserve/validate sequence, then actual
> >>>>>>>         submission/fencing, then unreserving, and finally some more
> >>>>>>>         copy_to_user in vmw_execbuf_copy_fence_user. Glossing over tons of
> >>>>>>>         details, but looks all safe.
> >>>>>>>       - vmw_fence_event_ioctl: No ttm_reserve/dma_resv_lock anywhere to be
> >>>>>>>         seen, seems to only create a fence and copy it out.
> >>>>>>>       - a pile of smaller ioctl in vmwgfx_ioctl.c, no reservations to be
> >>>>>>>         found there.
> >>>>>>>       Summary: vmwgfx seems to be fine too.
> >>>>>>>
> >>>>>>> - virtio: There's virtio_gpu_execbuffer_ioctl, which does all the
> >>>>>>>       copying from userspace before even looking up objects through their
> >>>>>>>       handles, so safe. Plus the getparam/getcaps ioctl, also both safe.
> >>>>>>>
> >>>>>>> - qxl only has qxl_execbuffer_ioctl, which calls into
> >>>>>>>       qxl_process_single_command. There's a lovely comment before the
> >>>>>>>       __copy_from_user_inatomic that the slowpath should be copied from
> >>>>>>>       i915, but I guess that never happened. Try not to be unlucky and get
> >>>>>>>       your CS data evicted between when it's written and the kernel tries
> >>>>>>>       to read it. The only other copy_from_user is for relocs, but those
> >>>>>>>       are done before qxl_release_reserve_list(), which seems to be the
> >>>>>>>       only thing reserving buffers (in the ttm/dma_resv sense) in that
> >>>>>>>       code. So looks safe.
> >>>>>>>
> >>>>>>> - A debugfs file in nouveau_debugfs_pstate_set() and the usif ioctl in
> >>>>>>>       usif_ioctl() look safe. nouveau_gem_ioctl_pushbuf() otoh breaks this
> >>>>>>>       everywhere and needs to be fixed up.
> >>>>>>>
> >>>>>>> Cc: Alex Deucher <alexander.deucher@xxxxxxx>
> >>>>>>> Cc: Christian König <christian.koenig@xxxxxxx>
> >>>>>>> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >>>>>>> Cc: Thomas Zimmermann <tzimmermann@xxxxxxx>
> >>>>>>> Cc: Rob Herring <robh@xxxxxxxxxx>
> >>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@xxxxxxxxxxxxx>
> >>>>>>> Cc: Eric Anholt <eric@xxxxxxxxxx>
> >>>>>>> Cc: Dave Airlie <airlied@xxxxxxxxxx>
> >>>>>>> Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> >>>>>>> Cc: Ben Skeggs <bskeggs@xxxxxxxxxx>
> >>>>>>> Cc: "VMware Graphics" <linux-graphics-maintainer@xxxxxxxxxx>
> >>>>>>> Cc: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
> >>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> >>>>>>> ---
> >>>>>>>      drivers/dma-buf/dma-resv.c | 12 ++++++++++++
> >>>>>>>      1 file changed, 12 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> >>>>>>> index 42a8f3f11681..3edca10d3faf 100644
> >>>>>>> --- a/drivers/dma-buf/dma-resv.c
> >>>>>>> +++ b/drivers/dma-buf/dma-resv.c
> >>>>>>> @@ -34,6 +34,7 @@
> >>>>>>>      #include <linux/dma-resv.h>
> >>>>>>>      #include <linux/export.h>
> >>>>>>> +#include <linux/sched/mm.h>
> >>>>>>>      /**
> >>>>>>>       * DOC: Reservation Object Overview
> >>>>>>> @@ -107,6 +108,17 @@ void dma_resv_init(struct dma_resv *obj)
> >>>>>>>                       &reservation_seqcount_class);
> >>>>>>>       RCU_INIT_POINTER(obj->fence, NULL);
> >>>>>>>       RCU_INIT_POINTER(obj->fence_excl, NULL);
> >>>>>>> +
> >>>>>>> +   if (IS_ENABLED(CONFIG_LOCKDEP)) {
> >>>>>>> +           if (current->mm)
> >>>>>>> +                   down_read(&current->mm->mmap_sem);
> >>>>>>> +           ww_mutex_lock(&obj->lock, NULL);
> >>>>>>> +           fs_reclaim_acquire(GFP_KERNEL);
> >>>>>>> +           fs_reclaim_release(GFP_KERNEL);
> >>>>>>> +           ww_mutex_unlock(&obj->lock);
> >>>>>>> +           if (current->mm)
> >>>>>>> +                   up_read(&current->mm->mmap_sem);
> >>>>>>> +   }
> >>>>>>>      }
> >>>>>>>      EXPORT_SYMBOL(dma_resv_init);
> >>>>>> I assume if this would have been easily done and maintainable using only
> >>>>>> lockdep annotation instead of actually acquiring the locks, that would have
> >>>>>> been done?
> >>>>> There's might_lock(), plus a pile of macros, but they don't map obviuosly,
> >>>>> so pretty good chances I accidentally end up with the wrong type of
> >>>>> annotation. Easier to just take the locks quickly, and stuff that all into
> >>>>> a lockdep-only section to avoid overhead.
> >>>>>
> >>>>>> Otherwise LGTM.
> >>>>>>
> >>>>>> Reviewed-by: Thomas Hellström <thellstrom@xxxxxxxxxx>
> >>>>>>
> >>>>>> Will test this and let you know if it trips on vmwgfx, but it really
> >>>>>> shouldn't.
> >>>>> Thanks, Daniel
> >>>> One thing that strikes me is that this puts restrictions on where you
> >>>> can actually initialize a dma_resv, even if locking orders are otherwise
> >>>> obeyed. But that might not be a big problem.
> >>> Hm yeah ... the trouble is a need a non-kthread thread so that I have
> >>> a current->mm. Otherwise I'd have put it into some init section with a
> >>> temp dma_buf. And I kinda don't want to create a fake ->mm just for
> >>> lockdep priming. I don't expect this to be a real problem in practice,
> >>> since before you've called dma_resv_init the reservation lock doesn't
> >>> exist, so you can't hold it. And you've probably just allocated it, so
> >>> fs_reclaim is going to be fine. And if you allocate dma_resv objects
> >>> from your fault handlers I have questions anyway :-)
> >> Coming to think of it, I think vmwgfx sometimes create bos with other bo's
> >> reservation lock held. I guess that would trip both the mmap_sem check the
> >> ww_mutex check?
> > If you do that, yes we're busted. Do you do that?
>
> Yes, we do, in a couple of places it seems, and it also appears like TTM
> is doing it according to Christian.
>
> >
> > I guess needs a new idea for where to put this ... while making sure
> > everyone gets it. So some evil trick like putting it in drm_open() won't
> > work, since I also want to make sure everyone else using dma-buf follows
> > these rules.
>
> IMO it should be sufficient to establish this locking order once, but I
> guess dma-buf module init time won't work because we might not have an
> mm structure?

mm_alloc() is a thing as Chris pointed out, and it works. v3 on its way.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx