I was just about to complain that this is certainly incorrect.
But it's strange that ttm_mem_evict_first causes the warning in the
first place since it should never try to evict a BO which is about to be
destroyed.
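
For illustration only, here is a minimal userspace sketch of the general "get unless zero" idea behind that expectation: a path that finds an object on a list only takes a reference if the count has not already reached zero, so it never touches an object that is being torn down. This is not TTM code. struct obj, obj_init, obj_get_unless_zero and obj_put are hypothetical names, and C11 atomics stand in for the kernel's refcounting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; not the TTM struct. */
struct obj {
	atomic_int refcount;
};

/* A new object starts with one reference held by its creator. */
static void obj_init(struct obj *o)
{
	atomic_init(&o->refcount, 1);
}

/* Take a reference only if the count is still non-zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	/* Count already hit zero: the object is on its way out. */
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

In this sketch, once obj_put() has returned true, every later obj_get_unless_zero() call fails, so a lookup path that checks the return value simply skips the dying object instead of underflowing the count.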
Regards,
Christian.
On 05.07.23 at 10:31, Lang Yu wrote:
Please ignore this patch, it will cause another issue.
Will send a new one.
Regards,
Lang
On 07/05/ , Lang Yu wrote:
[ 67.399887] refcount_t: underflow; use-after-free.
[ 67.399901] WARNING: CPU: 0 PID: 3172 at lib/refcount.c:28 refcount_warn_saturate+0xc2/0x110
[ 67.400124] RIP: 0010:refcount_warn_saturate+0xc2/0x110
[ 67.400173] Call Trace:
[ 67.400176] <TASK>
[ 67.400181] ttm_mem_evict_first+0x4fe/0x5b0 [ttm]
[ 67.400216] ttm_bo_mem_space+0x1e3/0x240 [ttm]
[ 67.400239] ttm_bo_validate+0xc7/0x190 [ttm]
[ 67.400253] ? ww_mutex_trylock+0x1b1/0x390
[ 67.400266] ttm_bo_init_reserved+0x183/0x1c0 [ttm]
[ 67.400280] ? __rwlock_init+0x3d/0x70
[ 67.400292] amdgpu_bo_create+0x1cd/0x4f0 [amdgpu]
[ 67.400607] ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
[ 67.400980] amdgpu_bo_create_user+0x38/0x70 [amdgpu]
[ 67.401291] amdgpu_gem_object_create+0x77/0xb0 [amdgpu]
[ 67.401641] ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
[ 67.401958] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x228/0xa30 [amdgpu]
[ 67.402433] kfd_ioctl_alloc_memory_of_gpu+0x14e/0x390 [amdgpu]
[ 67.402824] ? lock_release+0x13f/0x290
[ 67.402838] kfd_ioctl+0x1e0/0x640 [amdgpu]
[ 67.403205] ? __pfx_kfd_ioctl_alloc_memory_of_gpu+0x10/0x10 [amdgpu]
[ 67.403579] ? tomoyo_file_ioctl+0x19/0x20
[ 67.403590] __x64_sys_ioctl+0x95/0xd0
[ 67.403601] do_syscall_64+0x3b/0x90
[ 67.403609] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")
Signed-off-by: Lang Yu <Lang.Yu@xxxxxxx>
---
drivers/gpu/drm/ttm/ttm_bo.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bd5dae4d1624..e047b191001c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -308,6 +308,9 @@ static void ttm_bo_delayed_delete(struct work_struct *work)
 	bo = container_of(work, typeof(*bo), delayed_delete);
+	if (!ttm_bo_get_unless_zero(bo))
+		return;
+
 	dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP, false,
 			      MAX_SCHEDULE_TIMEOUT);
 	dma_resv_lock(bo->base.resv, NULL);
--
2.25.1