Am 10.09.2018 um 06:07 schrieb Emily Deng: > It will ramdomly have the dead lock issue when test TDR: > 1. amdgpu_device_handle_vram_lost gets the lock shadow_list_lock > 2. amdgpu_bo_create locked the bo's resv lock > 3. amdgpu_bo_create_shadow is waiting for the shadow_list_lock > 4. amdgpu_device_recover_vram_from_shadow is waiting for the bo's resv > lock. > > Signed-off-by: Emily Deng <Emily.Deng at amd.com> Good catch, problem is the fix doesn't work like this because the lock won't be dropped at this point in most cases. Instead you need to fix amdgpu_device_recover_vram_from_shadow to make a local copy of the list. You can do this by grabbing a reference to each BO and moving it to a local list. Regards, Christian. > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 +++++++----- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index de990bd..c75447d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -557,12 +557,8 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev, > bp.resv = bo->tbo.resv; > > r = amdgpu_bo_do_create(adev, &bp, &bo->shadow); > - if (!r) { > + if (!r) > bo->shadow->parent = amdgpu_bo_ref(bo); > - mutex_lock(&adev->shadow_list_lock); > - list_add_tail(&bo->shadow_list, &adev->shadow_list); > - mutex_unlock(&adev->shadow_list_lock); > - } > > return r; > } > @@ -603,6 +599,12 @@ int amdgpu_bo_create(struct amdgpu_device *adev, > if (!bp->resv) > reservation_object_unlock((*bo_ptr)->tbo.resv); > > + if (!r) { > + mutex_lock(&adev->shadow_list_lock); > + list_add_tail(&(*bo_ptr)->shadow_list, &adev->shadow_list); > + mutex_unlock(&adev->shadow_list_lock); > + } > + > if (r) > amdgpu_bo_unref(bo_ptr); > }