When ttm_bo_init() fails, the reservation mutex should be unlocked. In debug build, the kernel reported "possible recursive locking detected" in this codepath. For debugging purposes, I also added a "WARN_ON(ww_mutex_is_locked())" when ttm_bo_init() fails and the mutex was locked as expected. This should fix (random) GPU hangs. The easy way to reproduce the issue is to change the "Super Sampling" option from 1.0 to 2.0 in Hitman. It will create a huge buffer, evict a bunch of buffers (around ~5k) and deadlock. This regression has been introduced pretty recently. Fixes: "drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)" Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com> --- Here's the report: https://hastebin.com/durodivoma.xml drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index d1ef1d064de4..531e16ce256e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -403,8 +403,10 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev, &bo->placement, page_align, !kernel, NULL, acc_size, sg, resv ? resv : &bo->tbo.ttm_resv, &amdgpu_ttm_bo_destroy); - if (unlikely(r != 0)) + if (unlikely(r != 0)) { + ww_mutex_unlock(&bo->tbo.resv->lock); return r; + } bo->tbo.priority = ilog2(bo->tbo.num_pages); if (kernel) -- 2.11.1