Re: [PATCH 1/2] amd/amdkfd: sync all devices to wait all processes being evicted

On 4/1/2024 4:53 PM, Zhigang Luo wrote:

If more than one device is doing a reset in parallel, the first
device calls kfd_suspend_all_processes() to evict all processes
on all devices, and this call takes time to finish. The other devices
start reset and recovery without waiting. If a process has not been
evicted before recovery begins, it is restored, which then causes a
page fault.

Signed-off-by: Zhigang Luo <Zhigang.Luo@xxxxxxx>
Change-Id: Ib1eddb56b69ecd41fe703abd169944154f48b0cd
---
  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 041ec3de55e7..55f89c858c7a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -969,11 +969,11 @@ void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm)
         if (!run_pm) {
                 mutex_lock(&kfd_processes_mutex);
                 count = ++kfd_locked;
-               mutex_unlock(&kfd_processes_mutex);

                 /* For first KFD device suspend all the KFD processes */
                 if (count == 1)
                         kfd_suspend_all_processes();
+               mutex_unlock(&kfd_processes_mutex);
         }

I do not understand why kfd_locked is used here. You want to evict all processes when the first device gets suspended. kfd_locked indicates whether all KFD driver functions are locked, which is not the same thing as a device being suspended. That is not an issue with your patch, but I think using a separate flag to record device suspend would be better. For example, if kfd_locked were already set for some other reason, we would skip evicting processes here.
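
To make the concern concrete, here is a minimal userspace sketch, not driver code. All names in it (lock_count_model, suspended_devices_model, evictions_model and both functions) are hypothetical stand-ins; only the "count == 1" check mirrors the patch. It assumes the lock count can be non-zero for a reason other than device suspend.

```c
/* Hypothetical stand-ins; none of these exist in the KFD driver. */
static int lock_count_model;        /* models kfd_locked */
static int suspended_devices_model; /* assumed dedicated suspend counter */
static int evictions_model;         /* how many times eviction ran */

/* Models the current check: evict only when the shared count becomes 1. */
static void suspend_shared(void)
{
	if (++lock_count_model == 1)
		evictions_model++;
}

/* Models the suggestion: a counter that tracks device suspend only. */
static void suspend_dedicated(void)
{
	lock_count_model++;
	if (++suspended_devices_model == 1)
		evictions_model++;
}
```

If lock_count_model is already 1 when the first device suspends, suspend_shared() sees a count of 2 and skips eviction, while suspend_dedicated() still evicts.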

Regards

Xiaogang

         for (i = 0; i < kfd->num_nodes; i++) {
--
2.25.1
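
The ordering the patch establishes can be modelled in userspace as below. This is only a sketch under stated assumptions: a pthread mutex stands in for kfd_processes_mutex, and processes_mutex_model, locked_count_model, all_evicted_model and both functions are hypothetical names, not part of the driver.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel objects in the patch. */
static pthread_mutex_t processes_mutex_model = PTHREAD_MUTEX_INITIALIZER;
static int locked_count_model;  /* models kfd_locked */
static bool all_evicted_model;  /* set once eviction has finished */

/* Models kfd_suspend_all_processes(); the real call takes time. */
static void suspend_all_processes_model(void)
{
	all_evicted_model = true;
}

/*
 * Models the !run_pm branch of kgd2kfd_suspend() after the patch.
 * The mutex is held across the eviction, so a second caller blocks
 * in pthread_mutex_lock() until the first caller has finished.
 * Returns whether eviction was complete when this caller unlocked.
 */
static bool device_suspend_model(void)
{
	bool done;

	pthread_mutex_lock(&processes_mutex_model);
	if (++locked_count_model == 1)
		suspend_all_processes_model();
	done = all_evicted_model;
	pthread_mutex_unlock(&processes_mutex_model);
	return done;
}
```

With the unlock placed before the eviction, as in the pre-patch code, a second caller could return while the eviction was still in progress. That is the window described in the commit message.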



