Re: [EXTERNAL] [PATCH 2/2] drm/amdkfd: Add PCIe Hotplug Support for AMDKFD

On 2022-04-19 at 12:01, Andrey Grodzovsky wrote:
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -134,6 +134,7 @@ struct amdkfd_process_info {
/* MMU-notifier related fields */
atomic_t evicted_bos;
+atomic_t invalid;
struct delayed_work restore_userptr_work;
struct pid *pid;
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 99d2b15bcbf3..2a588eb9f456 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1325,6 +1325,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void **process_info,
info->pid = get_task_pid(current->group_leader, PIDTYPE_PID);
atomic_set(&info->evicted_bos, 0);
+atomic_set(&info->invalid, 0);
INIT_DELAYED_WORK(&info->restore_userptr_work,
 amdgpu_amdkfd_restore_userptr_worker);
@@ -2693,6 +2694,9 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct work_struct *work)
struct mm_struct *mm;
int evicted_bos;
+if (atomic_read(&process_info->invalid))
+return;
+


It would probably be better to use a drm_dev_enter/exit guard pair here again instead of this flag.



I don’t know whether I can use drm_dev_enter/exit efficiently here, because a process can have multiple drm_devs open. I also don’t see how to look up the relevant drm_dev(s) efficiently from within the worker function in order to call drm_dev_enter/exit.


I think that within the KFD code each kfd device belongs to, or points to, one specific drm_device, so I don't think this is a problem.
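
For illustration, the guard-pair idea with several devices per process could look roughly like the sketch below. This is a userspace model, not kernel code: struct drm_device, drm_dev_enter() and drm_dev_exit() are simplified stand-ins that only mimic the shape of the real DRM helpers, and struct kfd_dev_model and restore_all() are hypothetical names that do not appear in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in, NOT the kernel's struct drm_device. */
struct drm_device {
	bool unplugged;
};

/* Mimics the shape of the kernel helper: returns false once unplugged. */
static bool drm_dev_enter(struct drm_device *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = 0; /* the real helper hands back an SRCU read-lock index */
	return true;
}

/* The real helper drops the SRCU read lock; nothing to do in this model. */
static void drm_dev_exit(int idx)
{
	(void)idx;
}

/* Hypothetical: one entry per GPU that the process has opened. */
struct kfd_dev_model {
	struct drm_device *ddev;
	int restored; /* counts restore passes that reached this device */
};

/*
 * Walk all devices of a process and only touch those still present.
 * Returns the number of devices that were actually processed.
 */
static int restore_all(struct kfd_dev_model *devs, size_t n)
{
	int touched = 0;

	for (size_t i = 0; i < n; i++) {
		int idx;

		if (!drm_dev_enter(devs[i].ddev, &idx))
			continue; /* device is gone: skip its hardware access */
		devs[i].restored++;
		touched++;
		drm_dev_exit(idx);
	}
	return touched;
}
```

The point of the sketch is only that the guard is taken per device inside the loop, so an unplugged GPU is skipped while the remaining ones are still serviced.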

Sorry, I haven't been following this discussion in all its details. But I don't see why you need to check a flag in the worker. If the GPU is unplugged you already cancel any pending work. How is new work getting scheduled after the GPU is unplugged? Is it due to pending interrupts or something? Can you instead invalidate process_info->restore_userptr_work to prevent it from being scheduled again? Or add some check where it's scheduling the work, instead of in the worker.

Regards,
  Felix
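
A minimal sketch of the "check where the work is scheduled" alternative, again as a userspace model rather than driver code. struct process_info_model, the teardown flag, try_schedule_restore(), queue_restore_work() and on_unplug() are all hypothetical names chosen for illustration; queue_restore_work() merely stands in for a call such as schedule_delayed_work().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the relevant part of the process info. */
struct process_info_model {
	atomic_bool teardown; /* set once on unplug */
	int queued;           /* counts how often the worker was queued */
};

/* Stand-in for schedule_delayed_work(): it only counts in this model. */
static void queue_restore_work(struct process_info_model *info)
{
	info->queued++;
}

/* Every scheduling site goes through this helper instead of queueing directly. */
static bool try_schedule_restore(struct process_info_model *info)
{
	if (atomic_load(&info->teardown))
		return false; /* device is going away: do not re-arm the worker */
	queue_restore_work(info);
	return true;
}

/* Unplug path: raise the flag first, then cancel what is already pending. */
static void on_unplug(struct process_info_model *info)
{
	atomic_store(&info->teardown, true);
	/* kernel code would cancel the pending delayed work at this point */
}
```

With the check at the scheduling site the worker body itself needs no flag test. Note that in this simplified model the check and the queueing are not one atomic step, so a real implementation would still need to order the flag update against the final cancel.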




