Am 07.04.24 um 06:52 schrieb Lang Yu:
When VM is in evicting state, amdgpu_vm_update_range would return -EBUSY.
Then restore_process_worker runs into a dead loop.
Fixes: 2fdba514ad5a ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs")
Mhm, while it would be good to have this case handled as error it should
never occur in practice since we should have validated the VM before
validating the DMA-bufs.
@Felix isn't that something we have taken care of?
Regards,
Christian.
Signed-off-by: Lang Yu <Lang.Yu@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 0ae9fd844623..8c71fe07807a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2900,6 +2900,12 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence __rcu *
amdgpu_sync_create(&sync_obj);
+ ret = process_validate_vms(process_info, NULL);
+ if (ret) {
+ pr_debug("Validating VMs failed, ret: %d\n", ret);
+ goto validate_map_fail;
+ }
+
/* Validate BOs and map them to GPUVM (update VM page tables). */
list_for_each_entry(mem, &process_info->kfd_bo_list,
validate_list) {