On Wed, Aug 21, 2019 at 5:19 PM Thomas Hellström (VMware) <thomas_os@xxxxxxxxxxxx> wrote: > On 8/21/19 5:14 PM, Daniel Vetter wrote: > > On Wed, Aug 21, 2019 at 5:03 PM Thomas Hellström (VMware) > > <thomas_os@xxxxxxxxxxxx> wrote: > >> On 8/21/19 4:47 PM, Daniel Vetter wrote: > >>> On Wed, Aug 21, 2019 at 4:27 PM Thomas Hellström (VMware) > >>> <thomas_os@xxxxxxxxxxxx> wrote: > >>>> On 8/21/19 4:09 PM, Daniel Vetter wrote: > >>>>> On Wed, Aug 21, 2019 at 2:47 PM Thomas Hellström (VMware) > >>>>> <thomas_os@xxxxxxxxxxxx> wrote: > >>>>>> On 8/21/19 2:40 PM, Thomas Hellström (VMware) wrote: > >>>>>>> On 8/20/19 4:53 PM, Daniel Vetter wrote: > >>>>>>>> With nouveau fixed all ttm-using drives have the correct nesting of > >>>>>>>> mmap_sem vs dma_resv, and we can just lock the buffer. > >>>>>>>> > >>>>>>>> Assuming I didn't screw up anything with my audit of course. > >>>>>>>> > >>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx> > >>>>>>>> Cc: Christian Koenig <christian.koenig@xxxxxxx> > >>>>>>>> Cc: Huang Rui <ray.huang@xxxxxxx> > >>>>>>>> Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx> > >>>>>>>> Cc: "VMware Graphics" <linux-graphics-maintainer@xxxxxxxxxx> > >>>>>>>> Cc: Thomas Hellstrom <thellstrom@xxxxxxxxxx> > >>>>>>>> --- > >>>>>>>> drivers/gpu/drm/ttm/ttm_bo.c | 34 --------------------------------- > >>>>>>>> drivers/gpu/drm/ttm/ttm_bo_vm.c | 26 +------------------------ > >>>>>>>> include/drm/ttm/ttm_bo_api.h | 1 - > >>>>>>>> 3 files changed, 1 insertion(+), 60 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > >>>>>>>> index 20ff56f27aa4..a952dd624b06 100644 > >>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c > >>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c > >>>>>>>> @@ -1954,37 +1954,3 @@ void ttm_bo_swapout_all(struct ttm_bo_device > >>>>>>>> *bdev) > >>>>>>>> ; > >>>>>>>> } > >>>>>>>> EXPORT_SYMBOL(ttm_bo_swapout_all); > >>>>>>>> - > >>>>>>>> -/** > >>>>>>>> - * ttm_bo_wait_unreserved - interruptible wait for a buffer object > >>>>>>>> to become > >>>>>>>> - * unreserved > >>>>>>>> - * > >>>>>>>> - * @bo: Pointer to buffer > >>>>>>>> - */ > >>>>>>>> -int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo) > >>>>>>>> -{ > >>>>>>>> - int ret; > >>>>>>>> - > >>>>>>>> - /* > >>>>>>>> - * In the absense of a wait_unlocked API, > >>>>>>>> - * Use the bo::wu_mutex to avoid triggering livelocks due to > >>>>>>>> - * concurrent use of this function. Note that this use of > >>>>>>>> - * bo::wu_mutex can go away if we change locking order to > >>>>>>>> - * mmap_sem -> bo::reserve. > >>>>>>>> - */ > >>>>>>>> - ret = mutex_lock_interruptible(&bo->wu_mutex); > >>>>>>>> - if (unlikely(ret != 0)) > >>>>>>>> - return -ERESTARTSYS; > >>>>>>>> - if (!dma_resv_is_locked(bo->base.resv)) > >>>>>>>> - goto out_unlock; > >>>>>>>> - ret = dma_resv_lock_interruptible(bo->base.resv, NULL); > >>>>>>>> - if (ret == -EINTR) > >>>>>>>> - ret = -ERESTARTSYS; > >>>>>>>> - if (unlikely(ret != 0)) > >>>>>>>> - goto out_unlock; > >>>>>>>> - dma_resv_unlock(bo->base.resv); > >>>>>>>> - > >>>>>>>> -out_unlock: > >>>>>>>> - mutex_unlock(&bo->wu_mutex); > >>>>>>>> - return ret; > >>>>>>>> -} > >>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>>>> index 76eedb963693..505e1787aeea 100644 > >>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>>>> @@ -125,31 +125,7 @@ static vm_fault_t ttm_bo_vm_fault(struct > >>>>>>>> vm_fault *vmf) > >>>>>>>> &bdev->man[bo->mem.mem_type]; > >>>>>>>> struct vm_area_struct cvma; > >>>>>>>> - /* > >>>>>>>> - * Work around locking order reversal in fault / nopfn > >>>>>>>> - * between mmap_sem and bo_reserve: Perform a trylock operation > >>>>>>>> - * for reserve, and if it fails, retry the fault after waiting > >>>>>>>> - * for the buffer to become unreserved. > >>>>>>>> - */ > >>>>>>>> - if (unlikely(!dma_resv_trylock(bo->base.resv))) { > >>>>>>>> - if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) { > >>>>>>>> - if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) { > >>>>>>>> - ttm_bo_get(bo); > >>>>>>>> - up_read(&vmf->vma->vm_mm->mmap_sem); > >>>>>>>> - (void) ttm_bo_wait_unreserved(bo); > >>>>>>>> - ttm_bo_put(bo); > >>>>>>>> - } > >>>>>>>> - > >>>>>>>> - return VM_FAULT_RETRY; > >>>>>>>> - } > >>>>>>>> - > >>>>>>>> - /* > >>>>>>>> - * If we'd want to change locking order to > >>>>>>>> - * mmap_sem -> bo::reserve, we'd use a blocking reserve here > >>>>>>>> - * instead of retrying the fault... > >>>>>>>> - */ > >>>>>>> I think you should justify why the above code is removed, since the > >>>>>>> comments actually outlines what to do if we change locking order. > >>>>>>> > >>>>>>> The code that's removed above is not for adjusting locking orders but > >>>>>>> to decrease the mm latency by releasing the mmap_sem while waiting for > >>>>>>> bo reserve which in turn might be waiting for GPU. At a minimum we > >>>>>>> should have a separate patch with justification. > >>>>>>> > >>>>>>> Note that the caller here ensures locking progress by adjusting the > >>>>>>> RETRY flags after a retry. > >>>>> That would be patches 1&2 in this series. > >>>>> > >>>> Hmm? Those seem to touch only dma-buf and nouveau not ttm? I mean this > >>>> patch should look along the lines of (based on an older tree) to > >>>> implement the new locking-order mmap_sem->reservation, > >>> Only nouveau was breaking was doing copy_*_user or get_user_pages > >>> while holding dma_resv locks, no one else. So nothing else needed to > >>> be changed. But patch 1 contains the full audit. I might have missed > >>> something. > >>> > >>>> but to keep the mm latency optimization using the RETRY functionality: > >>> Still no idea why this is needed? All the comments here and the code > >>> and history seem like they've been about the mmap_sem vs dma_resv > >>> inversion between driver ioctls and fault handling here. Once that's > >>> officially fixed there's no reason to play games here and retry loops > >>> - previously that was necessary because the old ttm_bo_vm_fault had a > >>> busy spin and that's definitely not nice. If it's needed I think it > >>> should be a second patch on top, to keep this all clear. I had to > >>> audit an enormous amount of code, I'd like to make sure I didn't miss > >>> anything before we start to make this super fancy again. Further > >>> patches on top is obviously all fine with me. > >>> -Daniel > >> Yes, but there are two different things you remove here. One is the > >> workaround for the locking reversal, which is obviously correct. > >> > >> One is TTM's implementation of the mmap_sem latency optimization, which > >> looks like an oversight. > >> > >> That optimization is why the VM_FAULT_RETRY functionality was added to > >> mm in the first place and is intended to be used when drivers expect a > >> substantial sleep to avoid keeping the pretty globalish mmap_sem held > >> while that sleep is taking place. We do exactly the same while waiting > >> for move-fences (ttm_bo_vm_fault_idle) and other drivers that expect > >> long waits in the fault handler do the same. > > Hm, are we guaranteed that core mm will only call us once with > > ALLOW_RETRY? > > Last time I looked in the implementation, yes. The ALLOW_RETRY was there > to specifically > allow making progress in the locking. > > > Just to make sure that we're not live-locking here. I'd > > also like to get rid of the wu_mutex, that just looks really strange > > (and I thought it was to duct-tape over the inversion, not as an > > optimization). If the livelock due to locking inversion is gone I have > > no idea anymore why we even needs the wu_mutex. > > Yes, my interpretation of this is that wu_mutex definitely can be ditched. Ok I'll respin and just do normal interruptible locks, keeping the mmap_sem-less path. I think perfectly ok to leave the optimization in, as long as we can remove the wu_mutex. btw r-b/t-b on patch 1 from vmwgfx would be very much appreciated. That one is the real trick in this series here I think. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel