On Thu, 2021-05-27 at 16:54 +0200, Christian König wrote:
> On 27.05.21 at 16:19, Thomas Hellström wrote:
> > The swapping code was dereferencing bo->ttm pointers without holding
> > the dma-resv lock. It might also try to swap out unpopulated bos.
> >
> > Fix this by deferring the bo->ttm dereference until we hold the
> > reservation lock. Check that the ttm_tt is populated after the
> > swap_notify callback.
> >
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/ttm/ttm_bo.c     | 16 +++++++++++++++-
> >  drivers/gpu/drm/ttm/ttm_device.c |  8 +++-----
> >  2 files changed, 18 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> > index 9f53506a82fc..86213d37657b 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -1163,6 +1163,16 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
> >  	if (!ttm_bo_evict_swapout_allowable(bo, ctx, &place, &locked, NULL))
> >  		return -EBUSY;
> >
> > +	dma_resv_assert_held(bo->base.resv);
> > +
> > +	if (!bo->ttm ||
> > +	    bo->ttm->page_flags & TTM_PAGE_FLAG_SG ||
> > +	    bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) {
> > +		if (locked)
> > +			dma_resv_unlock(bo->base.resv);
> > +		return -EBUSY;
> > +	}
> > +
> >  	if (!ttm_bo_get_unless_zero(bo)) {
> >  		if (locked)
> >  			dma_resv_unlock(bo->base.resv);
> > @@ -1215,7 +1225,8 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
> >  	if (bo->bdev->funcs->swap_notify)
> >  		bo->bdev->funcs->swap_notify(bo);
> >
> > -	ret = ttm_tt_swapout(bo->bdev, bo->ttm, gfp_flags);
> > +	if (ttm_tt_is_populated(bo->ttm))
> > +		ret = ttm_tt_swapout(bo->bdev, bo->ttm, gfp_flags);
>
> That is exactly what I wouldn't recommend. We would try to swap out the
> same BO over and over again with that.

But we wouldn't, since the BO is taken off the LRU and never re-added.

>
> Why not move that to the check above as well?

Because the BO may become unpopulated in swap_notify(). i915, like
vmwgfx, sometimes sets up GPU bindings from system memory, and when we
get a notification from user-space that those are purgeable, we don't
want to purge immediately but wait for a potential swapout.

/Thomas

>
> Christian.
>
> >  out:
> >
> >  	/*
> > @@ -1225,6 +1236,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
> >  	if (locked)
> >  		dma_resv_unlock(bo->base.resv);
> >  	ttm_bo_put(bo);
> > +
> > +	/* Don't break locking rules. */
> > +	WARN_ON(ret == -EBUSY);
> >  	return ret;
> >  }
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> > index 460953dcad11..eaa7487ae404 100644
> > --- a/drivers/gpu/drm/ttm/ttm_device.c
> > +++ b/drivers/gpu/drm/ttm/ttm_device.c
> > @@ -143,14 +143,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
> >
> >  		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
> >  			list_for_each_entry(bo, &man->lru[j], lru) {
> > -				uint32_t num_pages;
> > +				pgoff_t num_pages;
> >
> > -				if (!bo->ttm ||
> > -				    bo->ttm->page_flags & TTM_PAGE_FLAG_SG ||
> > -				    bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)
> > +				if (!READ_ONCE(bo->ttm))
> >  					continue;
> >
> > -				num_pages = bo->ttm->num_pages;
> > +				num_pages = bo->base.size >> PAGE_SHIFT;
> >  				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
> >  				/* ttm_bo_swapout has dropped the lru_lock */
> >  				if (!ret)
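
For readers following the thread, the pattern under discussion (a lockless NULL peek used only as a hint, the real checks repeated under the lock, and the populated check made after the notify callback) can be sketched in plain userspace C11. This is only an analogue under stated assumptions, not the kernel code: every fake_* name is hypothetical, an atomic_flag stands in for the dma-resv lock, and the purgeable field is an invented stand-in for a driver whose swap_notify() may unpopulate the object.

```c
/* Userspace analogue only. All fake_* names are hypothetical, not TTM API. */
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_FLAG_SG      (1u << 0)
#define FAKE_FLAG_SWAPPED (1u << 1)

struct fake_tt {
	unsigned int page_flags;
	bool populated;
};

struct fake_bo {
	atomic_flag resv_locked;       /* stand-in for the dma-resv lock */
	_Atomic(struct fake_tt *) tt;  /* may change while the lock is not held */
	bool purgeable;                /* assumption: notify drops the pages */
};

static void fake_bo_init(struct fake_bo *bo, struct fake_tt *tt, bool purgeable)
{
	atomic_flag_clear(&bo->resv_locked);
	atomic_init(&bo->tt, tt);
	bo->purgeable = purgeable;
}

static bool fake_resv_trylock(struct fake_bo *bo)
{
	return !atomic_flag_test_and_set_explicit(&bo->resv_locked,
						  memory_order_acquire);
}

static void fake_resv_unlock(struct fake_bo *bo)
{
	atomic_flag_clear_explicit(&bo->resv_locked, memory_order_release);
}

/* Stand-in for a swap_notify callback that may unpopulate the object. */
static void fake_swap_notify(struct fake_bo *bo, struct fake_tt *tt)
{
	if (bo->purgeable)
		tt->populated = false;
}

/* Returns 0 on success (including "nothing left to swap"), -EBUSY to skip. */
static int fake_swapout(struct fake_bo *bo)
{
	struct fake_tt *tt;
	int ret = 0;

	/* Lockless peek: a hint only, the pointer is not dereferenced here. */
	if (!atomic_load_explicit(&bo->tt, memory_order_relaxed))
		return -EBUSY;

	if (!fake_resv_trylock(bo))
		return -EBUSY;

	/* Re-read and dereference only with the lock held. */
	tt = atomic_load_explicit(&bo->tt, memory_order_relaxed);
	if (!tt || (tt->page_flags & (FAKE_FLAG_SG | FAKE_FLAG_SWAPPED))) {
		ret = -EBUSY;
		goto out_unlock;
	}

	fake_swap_notify(bo, tt);

	/* Late check: the callback above may have unpopulated the object. */
	if (tt->populated) {
		tt->populated = false;
		tt->page_flags |= FAKE_FLAG_SWAPPED;
	}

out_unlock:
	fake_resv_unlock(bo);
	return ret;
}
```

Calling fake_swapout() on a populated object marks it swapped and returns 0; a second call returns -EBUSY because the flag check now happens under the lock. With purgeable set, the call returns 0 without marking anything swapped, which mirrors the reason given above for keeping the populated check after swap_notify() rather than folding it into the earlier check.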