This is a note to let you know that I've just added the patch titled drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram to the 5.15-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: drm-amdgpu-fix-null-pointer-dereference-error-in-amd.patch and it can be found in the queue-5.15 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit ee1fdf315fd19f56c36d1f13bc983c841addce4f Author: Horatio Zhang <Hongkun.Zhang@xxxxxxx> Date: Mon May 29 14:23:37 2023 -0400 drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram [ Upstream commit 2a1eb1a343208ce7d6839b73d62aece343e693ff ] Use the function of amdgpu_bo_vm_destroy to handle the resource release of shadow bo. During the amdgpu_mes_self_test, shadow bo released, but vmbo->shadow_list was not, which caused a null pointer reference error in amdgpu_device_recover_vram when GPU reset. Fixes: 6c032c37ac3e ("drm/amdgpu: Fix vram recover doesn't work after whole GPU reset (v2)") Signed-off-by: xinhui pan <xinhui.pan@xxxxxxx> Signed-off-by: Horatio Zhang <Hongkun.Zhang@xxxxxxx> Acked-by: Feifei Xu <Feifei.Xu@xxxxxxx> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index a0b1bf17cb74b..d03a4519f945b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -78,9 +78,10 @@ static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo) static void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo) { struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev); - struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); + struct amdgpu_bo *shadow_bo = ttm_to_amdgpu_bo(tbo), *bo; struct amdgpu_bo_vm *vmbo; + bo = shadow_bo->parent; vmbo = to_amdgpu_bo_vm(bo); /* in case amdgpu_device_recover_vram got NULL of bo->parent */ if (!list_empty(&vmbo->shadow_list)) { @@ -690,7 +691,6 @@ int amdgpu_bo_create_vm(struct amdgpu_device *adev, return r; *vmbo_ptr = to_amdgpu_bo_vm(bo_ptr); - INIT_LIST_HEAD(&(*vmbo_ptr)->shadow_list); return r; } @@ -741,6 +741,8 @@ void amdgpu_bo_add_to_shadow_list(struct amdgpu_bo_vm *vmbo) mutex_lock(&adev->shadow_list_lock); list_add_tail(&vmbo->shadow_list, &adev->shadow_list); + vmbo->shadow->parent = amdgpu_bo_ref(&vmbo->bo); + vmbo->shadow->tbo.destroy = &amdgpu_bo_vm_destroy; mutex_unlock(&adev->shadow_list_lock); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 01710cd0d9727..924c6d5f86203 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -983,7 +983,6 @@ static int amdgpu_vm_pt_create(struct amdgpu_device *adev, return r; } - (*vmbo)->shadow->parent = amdgpu_bo_ref(bo); amdgpu_bo_add_to_shadow_list(*vmbo); return 0;