Re: [PATCH 3/3] drm/ttm: remove ttm_bo_wait_unreserved

Daniel Vetter <daniel.vetter@xxxxxxxx> · Wed, 21 Aug 2019 17:14:35 +0200

On Wed, Aug 21, 2019 at 5:03 PM Thomas Hellström (VMware)
<thomas_os@xxxxxxxxxxxx> wrote:
> On 8/21/19 4:47 PM, Daniel Vetter wrote:
> > On Wed, Aug 21, 2019 at 4:27 PM Thomas Hellström (VMware)
> > <thomas_os@xxxxxxxxxxxx> wrote:
> >> On 8/21/19 4:09 PM, Daniel Vetter wrote:
> >>> On Wed, Aug 21, 2019 at 2:47 PM Thomas Hellström (VMware)
> >>> <thomas_os@xxxxxxxxxxxx> wrote:
> >>>> On 8/21/19 2:40 PM, Thomas Hellström (VMware) wrote:
> >>>>> On 8/20/19 4:53 PM, Daniel Vetter wrote:
> >>>>>> With nouveau fixed all ttm-using drives have the correct nesting of
> >>>>>> mmap_sem vs dma_resv, and we can just lock the buffer.
> >>>>>>
> >>>>>> Assuming I didn't screw up anything with my audit of course.
> >>>>>>
> >>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> >>>>>> Cc: Christian Koenig <christian.koenig@xxxxxxx>
> >>>>>> Cc: Huang Rui <ray.huang@xxxxxxx>
> >>>>>> Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> >>>>>> Cc: "VMware Graphics" <linux-graphics-maintainer@xxxxxxxxxx>
> >>>>>> Cc: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
> >>>>>> ---
> >>>>>>     drivers/gpu/drm/ttm/ttm_bo.c    | 34 ---------------------------------
> >>>>>>     drivers/gpu/drm/ttm/ttm_bo_vm.c | 26 +------------------------
> >>>>>>     include/drm/ttm/ttm_bo_api.h    |  1 -
> >>>>>>     3 files changed, 1 insertion(+), 60 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>> index 20ff56f27aa4..a952dd624b06 100644
> >>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>> @@ -1954,37 +1954,3 @@ void ttm_bo_swapout_all(struct ttm_bo_device
> >>>>>> *bdev)
> >>>>>>             ;
> >>>>>>     }
> >>>>>>     EXPORT_SYMBOL(ttm_bo_swapout_all);
> >>>>>> -
> >>>>>> -/**
> >>>>>> - * ttm_bo_wait_unreserved - interruptible wait for a buffer object
> >>>>>> to become
> >>>>>> - * unreserved
> >>>>>> - *
> >>>>>> - * @bo: Pointer to buffer
> >>>>>> - */
> >>>>>> -int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo)
> >>>>>> -{
> >>>>>> -    int ret;
> >>>>>> -
> >>>>>> -    /*
> >>>>>> -     * In the absense of a wait_unlocked API,
> >>>>>> -     * Use the bo::wu_mutex to avoid triggering livelocks due to
> >>>>>> -     * concurrent use of this function. Note that this use of
> >>>>>> -     * bo::wu_mutex can go away if we change locking order to
> >>>>>> -     * mmap_sem -> bo::reserve.
> >>>>>> -     */
> >>>>>> -    ret = mutex_lock_interruptible(&bo->wu_mutex);
> >>>>>> -    if (unlikely(ret != 0))
> >>>>>> -        return -ERESTARTSYS;
> >>>>>> -    if (!dma_resv_is_locked(bo->base.resv))
> >>>>>> -        goto out_unlock;
> >>>>>> -    ret = dma_resv_lock_interruptible(bo->base.resv, NULL);
> >>>>>> -    if (ret == -EINTR)
> >>>>>> -        ret = -ERESTARTSYS;
> >>>>>> -    if (unlikely(ret != 0))
> >>>>>> -        goto out_unlock;
> >>>>>> -    dma_resv_unlock(bo->base.resv);
> >>>>>> -
> >>>>>> -out_unlock:
> >>>>>> -    mutex_unlock(&bo->wu_mutex);
> >>>>>> -    return ret;
> >>>>>> -}
> >>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>>>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>>>>> index 76eedb963693..505e1787aeea 100644
> >>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>>>>> @@ -125,31 +125,7 @@ static vm_fault_t ttm_bo_vm_fault(struct
> >>>>>> vm_fault *vmf)
> >>>>>>             &bdev->man[bo->mem.mem_type];
> >>>>>>         struct vm_area_struct cvma;
> >>>>>>     -    /*
> >>>>>> -     * Work around locking order reversal in fault / nopfn
> >>>>>> -     * between mmap_sem and bo_reserve: Perform a trylock operation
> >>>>>> -     * for reserve, and if it fails, retry the fault after waiting
> >>>>>> -     * for the buffer to become unreserved.
> >>>>>> -     */
> >>>>>> -    if (unlikely(!dma_resv_trylock(bo->base.resv))) {
> >>>>>> -        if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
> >>>>>> -            if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
> >>>>>> -                ttm_bo_get(bo);
> >>>>>> - up_read(&vmf->vma->vm_mm->mmap_sem);
> >>>>>> -                (void) ttm_bo_wait_unreserved(bo);
> >>>>>> -                ttm_bo_put(bo);
> >>>>>> -            }
> >>>>>> -
> >>>>>> -            return VM_FAULT_RETRY;
> >>>>>> -        }
> >>>>>> -
> >>>>>> -        /*
> >>>>>> -         * If we'd want to change locking order to
> >>>>>> -         * mmap_sem -> bo::reserve, we'd use a blocking reserve here
> >>>>>> -         * instead of retrying the fault...
> >>>>>> -         */
> >>>>> I think you should justify why the above code is removed, since the
> >>>>> comments actually outlines what to do if we change locking order.
> >>>>>
> >>>>> The code that's removed above is not for adjusting locking orders but
> >>>>> to decrease the mm latency by releasing the mmap_sem while waiting for
> >>>>> bo reserve which in turn might be waiting for GPU. At a minimum we
> >>>>> should have a separate patch with justification.
> >>>>>
> >>>>> Note that the caller here ensures locking progress by adjusting the
> >>>>> RETRY flags after a retry.
> >>> That would be patches 1&2 in this series.
> >>>
> >> Hmm? Those seem to touch only dma-buf and nouveau not ttm?  I mean this
> >> patch should look along the lines of (based on an older tree) to
> >> implement the new locking-order mmap_sem->reservation,
> > Only nouveau was breaking was doing copy_*_user or get_user_pages
> > while holding dma_resv locks, no one else. So nothing else needed to
> > be changed. But patch 1 contains the full audit. I might have missed
> > something.
> >
> >> but to keep the mm latency optimization using the RETRY functionality:
> > Still no idea why this is needed? All the comments here and the code
> > and history seem like they've been about the mmap_sem vs dma_resv
> > inversion between driver ioctls and fault handling here. Once that's
> > officially fixed there's no reason to play games here and retry loops
> > - previously that was necessary because the old ttm_bo_vm_fault had a
> > busy spin and that's definitely not nice. If it's needed I think it
> > should be a second patch on top, to keep this all clear. I had to
> > audit an enormous amount of code, I'd like to make sure I didn't miss
> > anything before we start to make this super fancy again. Further
> > patches on top is obviously all fine with me.
> > -Daniel
>
> Yes, but there are two different things you remove here. One is the
> workaround for the locking reversal, which is obviously correct.
>
> One is TTM's implementation of the mmap_sem latency optimization, which
> looks like an oversight.
>
> That optimization is why the VM_FAULT_RETRY functionality was added to
> mm in the first place and is intended to be used when drivers expect a
> substantial sleep to avoid keeping the pretty globalish mmap_sem held
> while that sleep is taking place. We do exactly the same while waiting
> for move-fences (ttm_bo_vm_fault_idle) and other drivers that expect
> long waits in the fault handler do the same.

Hm, are we guaranteed that core mm will only call us once with
ALLOW_RETRY? Just to make sure that we're not live-locking here. I'd
also like to get rid of the wu_mutex, that just looks really strange
(and I thought it was to duct-tape over the inversion, not as an
optimization). If the livelock due to locking inversion is gone I have
no idea anymore why we even needs the wu_mutex.

> To avoid this accidently happening there was even this comment:
>
>          /*
>           * If we'd want to change locking order to
>           * mmap_sem -> bo::reserve, we'd use a blocking reserve here
>           * instead of retrying the fault...
>           */
>          return VM_FAULT_NOPAGE;
>
> And I do really think we should avoid accidently removing this just to
> re-add it in a later patch, particularly when I pointed it out at review
> time.

Yeah I read that, but I didn't read this comment the way you now explained it.
-Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel