[PATCH 08/34] drm/amdkfd: fix kfd_suspend_all_processes for gfx941 debugging

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The debugger for GFX9.4.1 uses kfd_suspend_all_processes to pause the
compute pipe line so it can safely toggle the SQ's implicit wait on
barrier setting during debug attach/detach to work around the wave
exception s_barrier race condition.

For mGPU setups, repeated calls to cancel all outstanding restore work can
result in an assymetric permanent cancelling of the restored work from the
debug device after it has toggled the HW work around settings.
Instead of cancelling the outstanding restore work, just flush it as it
will be properly evicted anyways by the current suspend call.

Signed-off-by: Jonathan Kim <jonathan.kim@xxxxxxx>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 1e3795e7e18d..55a4ddd35e12 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2008,7 +2008,7 @@ void kfd_suspend_all_processes(void)
 	WARN(debug_evictions, "Evicting all processes");
 	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
 		cancel_delayed_work_sync(&p->eviction_work);
-		cancel_delayed_work_sync(&p->restore_work);
+		flush_delayed_work(&p->restore_work);
 
 		if (kfd_process_evict_queues(p, KFD_QUEUE_EVICTION_TRIGGER_SUSPEND))
 			pr_err("Failed to suspend process 0x%x\n", p->pasid);
-- 
2.25.1




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux