The reservation object might be locked again by evict/swap after it has
been individualized. The race looks like this:

cpu 0					cpu 1
BO release				BO evict or swap
					lock lru_lock
ttm_bo_individualize_resv {resv = &_resv}
					ttm_bo_evict_swapout_allowable
						dma_resv_trylock(resv)
->release_notify() {BUG_ON(!trylock(resv))}
					if (!ttm_bo_get_unless_zero())
						dma_resv_unlock(resv)
					unlock lru_lock

As a simple fix, acquire lru_lock before the resv trylock to avoid the
race above.

Signed-off-by: xinhui pan <xinhui.pan@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 928e8d57cd08..8f6da0034db9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -318,7 +318,9 @@ int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
 	ef = container_of(dma_fence_get(&info->eviction_fence->base),
 			struct amdgpu_amdkfd_fence, base);
 
+	spin_lock(&bo->tbo.bdev->lru_lock);
 	BUG_ON(!dma_resv_trylock(bo->tbo.base.resv));
+	spin_unlock(&bo->tbo.bdev->lru_lock);
 	ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
 	dma_resv_unlock(bo->tbo.base.resv);
 
-- 
2.25.1
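
For readers who want to step through the interleaving, here is a minimal single-threaded sketch of the race in plain userspace C. It is only a model under stated assumptions, not kernel code: struct model_bo, model_evict_step() and model_release_trylock() are hypothetical names, the two locks are plain booleans, and the evict path is assumed to always see a dying BO (refcount already zero), so it always drops resv again. The evict path is cut into the four steps from the diagram, and the release path attempts its trylock after a chosen number of those steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are not the kernel's data structures. */
struct model_bo {
	bool lru_locked;	/* stands in for bdev->lru_lock */
	bool resv_locked;	/* stands in for the individualized resv */
};

#define MODEL_EVICT_STEPS 4

/* Evict/swap path against a dying BO, one step per line of the diagram. */
static void model_evict_step(struct model_bo *bo, int step)
{
	switch (step) {
	case 0: bo->lru_locked = true; break;	/* lock lru_lock */
	case 1: bo->resv_locked = true; break;	/* dma_resv_trylock(resv) */
	case 2: bo->resv_locked = false; break;	/* refcount is zero: unlock resv */
	case 3: bo->lru_locked = false; break;	/* unlock lru_lock */
	}
}

/*
 * Run the evict path for steps_done steps, then let the release path try
 * to lock resv. With take_lru_lock set, the release path takes lru_lock
 * around the trylock, as in the patch. Returns true if the trylock
 * succeeds, i.e. the BUG_ON() would not fire.
 */
bool model_release_trylock(int steps_done, bool take_lru_lock)
{
	struct model_bo bo = { false, false };
	int step = 0;
	bool ok;

	while (step < steps_done && step < MODEL_EVICT_STEPS)
		model_evict_step(&bo, step++);

	if (take_lru_lock) {
		/* A spinlock would wait here until the evict path drops lru_lock. */
		while (bo.lru_locked && step < MODEL_EVICT_STEPS)
			model_evict_step(&bo, step++);
		bo.lru_locked = true;
	}

	ok = !bo.resv_locked;		/* result of the trylock */
	if (ok)
		bo.resv_locked = true;

	if (take_lru_lock)
		bo.lru_locked = false;

	return ok;
}
```

In this model, model_release_trylock(2, false) is the only failing case: the release path runs while the evict path still holds resv. With take_lru_lock set, the release path cannot observe that window, because the evict path takes and drops resv entirely under lru_lock.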