On Wed, Aug 21, 2019 at 5:03 PM Thomas Hellström (VMware) <thomas_os@xxxxxxxxxxxx> wrote: > On 8/21/19 4:47 PM, Daniel Vetter wrote: > > On Wed, Aug 21, 2019 at 4:27 PM Thomas Hellström (VMware) > > <thomas_os@xxxxxxxxxxxx> wrote: > >> On 8/21/19 4:09 PM, Daniel Vetter wrote: > >>> On Wed, Aug 21, 2019 at 2:47 PM Thomas Hellström (VMware) > >>> <thomas_os@xxxxxxxxxxxx> wrote: > >>>> On 8/21/19 2:40 PM, Thomas Hellström (VMware) wrote: > >>>>> On 8/20/19 4:53 PM, Daniel Vetter wrote: > >>>>>> With nouveau fixed all ttm-using drives have the correct nesting of > >>>>>> mmap_sem vs dma_resv, and we can just lock the buffer. > >>>>>> > >>>>>> Assuming I didn't screw up anything with my audit of course. > >>>>>> > >>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx> > >>>>>> Cc: Christian Koenig <christian.koenig@xxxxxxx> > >>>>>> Cc: Huang Rui <ray.huang@xxxxxxx> > >>>>>> Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx> > >>>>>> Cc: "VMware Graphics" <linux-graphics-maintainer@xxxxxxxxxx> > >>>>>> Cc: Thomas Hellstrom <thellstrom@xxxxxxxxxx> > >>>>>> --- > >>>>>> drivers/gpu/drm/ttm/ttm_bo.c | 34 --------------------------------- > >>>>>> drivers/gpu/drm/ttm/ttm_bo_vm.c | 26 +------------------------ > >>>>>> include/drm/ttm/ttm_bo_api.h | 1 - > >>>>>> 3 files changed, 1 insertion(+), 60 deletions(-) > >>>>>> > >>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > >>>>>> index 20ff56f27aa4..a952dd624b06 100644 > >>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c > >>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c > >>>>>> @@ -1954,37 +1954,3 @@ void ttm_bo_swapout_all(struct ttm_bo_device > >>>>>> *bdev) > >>>>>> ; > >>>>>> } > >>>>>> EXPORT_SYMBOL(ttm_bo_swapout_all); > >>>>>> - > >>>>>> -/** > >>>>>> - * ttm_bo_wait_unreserved - interruptible wait for a buffer object > >>>>>> to become > >>>>>> - * unreserved > >>>>>> - * > >>>>>> - * @bo: Pointer to buffer > >>>>>> - */ > >>>>>> -int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo) > >>>>>> -{ > >>>>>> - int ret; > >>>>>> - > >>>>>> - /* > >>>>>> - * In the absense of a wait_unlocked API, > >>>>>> - * Use the bo::wu_mutex to avoid triggering livelocks due to > >>>>>> - * concurrent use of this function. Note that this use of > >>>>>> - * bo::wu_mutex can go away if we change locking order to > >>>>>> - * mmap_sem -> bo::reserve. > >>>>>> - */ > >>>>>> - ret = mutex_lock_interruptible(&bo->wu_mutex); > >>>>>> - if (unlikely(ret != 0)) > >>>>>> - return -ERESTARTSYS; > >>>>>> - if (!dma_resv_is_locked(bo->base.resv)) > >>>>>> - goto out_unlock; > >>>>>> - ret = dma_resv_lock_interruptible(bo->base.resv, NULL); > >>>>>> - if (ret == -EINTR) > >>>>>> - ret = -ERESTARTSYS; > >>>>>> - if (unlikely(ret != 0)) > >>>>>> - goto out_unlock; > >>>>>> - dma_resv_unlock(bo->base.resv); > >>>>>> - > >>>>>> -out_unlock: > >>>>>> - mutex_unlock(&bo->wu_mutex); > >>>>>> - return ret; > >>>>>> -} > >>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>> index 76eedb963693..505e1787aeea 100644 > >>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >>>>>> @@ -125,31 +125,7 @@ static vm_fault_t ttm_bo_vm_fault(struct > >>>>>> vm_fault *vmf) > >>>>>> &bdev->man[bo->mem.mem_type]; > >>>>>> struct vm_area_struct cvma; > >>>>>> - /* > >>>>>> - * Work around locking order reversal in fault / nopfn > >>>>>> - * between mmap_sem and bo_reserve: Perform a trylock operation > >>>>>> - * for reserve, and if it fails, retry the fault after waiting > >>>>>> - * for the buffer to become unreserved. > >>>>>> - */ > >>>>>> - if (unlikely(!dma_resv_trylock(bo->base.resv))) { > >>>>>> - if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) { > >>>>>> - if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) { > >>>>>> - ttm_bo_get(bo); > >>>>>> - up_read(&vmf->vma->vm_mm->mmap_sem); > >>>>>> - (void) ttm_bo_wait_unreserved(bo); > >>>>>> - ttm_bo_put(bo); > >>>>>> - } > >>>>>> - > >>>>>> - return VM_FAULT_RETRY; > >>>>>> - } > >>>>>> - > >>>>>> - /* > >>>>>> - * If we'd want to change locking order to > >>>>>> - * mmap_sem -> bo::reserve, we'd use a blocking reserve here > >>>>>> - * instead of retrying the fault... > >>>>>> - */ > >>>>> I think you should justify why the above code is removed, since the > >>>>> comments actually outlines what to do if we change locking order. > >>>>> > >>>>> The code that's removed above is not for adjusting locking orders but > >>>>> to decrease the mm latency by releasing the mmap_sem while waiting for > >>>>> bo reserve which in turn might be waiting for GPU. At a minimum we > >>>>> should have a separate patch with justification. > >>>>> > >>>>> Note that the caller here ensures locking progress by adjusting the > >>>>> RETRY flags after a retry. > >>> That would be patches 1&2 in this series. > >>> > >> Hmm? Those seem to touch only dma-buf and nouveau not ttm? I mean this > >> patch should look along the lines of (based on an older tree) to > >> implement the new locking-order mmap_sem->reservation, > > Only nouveau was breaking was doing copy_*_user or get_user_pages > > while holding dma_resv locks, no one else. So nothing else needed to > > be changed. But patch 1 contains the full audit. I might have missed > > something. > > > >> but to keep the mm latency optimization using the RETRY functionality: > > Still no idea why this is needed? All the comments here and the code > > and history seem like they've been about the mmap_sem vs dma_resv > > inversion between driver ioctls and fault handling here. Once that's > > officially fixed there's no reason to play games here and retry loops > > - previously that was necessary because the old ttm_bo_vm_fault had a > > busy spin and that's definitely not nice. If it's needed I think it > > should be a second patch on top, to keep this all clear. I had to > > audit an enormous amount of code, I'd like to make sure I didn't miss > > anything before we start to make this super fancy again. Further > > patches on top is obviously all fine with me. > > -Daniel > > Yes, but there are two different things you remove here. One is the > workaround for the locking reversal, which is obviously correct. > > One is TTM's implementation of the mmap_sem latency optimization, which > looks like an oversight. > > That optimization is why the VM_FAULT_RETRY functionality was added to > mm in the first place and is intended to be used when drivers expect a > substantial sleep to avoid keeping the pretty globalish mmap_sem held > while that sleep is taking place. We do exactly the same while waiting > for move-fences (ttm_bo_vm_fault_idle) and other drivers that expect > long waits in the fault handler do the same. Hm, are we guaranteed that core mm will only call us once with ALLOW_RETRY? Just to make sure that we're not live-locking here. I'd also like to get rid of the wu_mutex, that just looks really strange (and I thought it was to duct-tape over the inversion, not as an optimization). If the livelock due to locking inversion is gone I have no idea anymore why we even needs the wu_mutex. > To avoid this accidently happening there was even this comment: > > /* > * If we'd want to change locking order to > * mmap_sem -> bo::reserve, we'd use a blocking reserve here > * instead of retrying the fault... > */ > return VM_FAULT_NOPAGE; > > And I do really think we should avoid accidently removing this just to > re-add it in a later patch, particularly when I pointed it out at review > time. Yeah I read that, but I didn't read this comment the way you now explained it. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel