Re: [PATCH] drm/amdkfd: fix some race conditions in vram buffer alloc/free of svm code

On 9/20/2023 9:55 AM, Felix Kuehling wrote:

On 2023-09-20 2:17, Xiaogang.Chen wrote:
From: Xiaogang Chen <xiaogang.chen@xxxxxxx>

This patch fixes:
1: The reference count of prange's svm_bo is decreased by an async call from hmm. When waiting for prange's svm_bo to be released, we should also wait for prange->svm_bo to become NULL; otherwise prange->svm_bo may be set to NULL after a new vram buffer has been allocated.

I agree with this part.



2: While waiting in the while loop for prange's svm_bo to be released, we should schedule the current task to give other tasks an opportunity to run, especially the workqueue task that handles the svm_bo ref release; otherwise we may hit a soft lockup.

We had a similar discussion a few weeks back for another soft lockup and I pointed to cond_resched, which seems to be the preferred way to avoid soft lockups in the kernel. Does cond_resched work for this case?

cond_resched() also works. I will send a new version that uses cond_resched(), which is a safer way to reschedule.

Regards

Xiaogang


Regards,
  Felix



Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@xxxxxxx>
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index bed0f8bf83c7..1074a4aedf57 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -502,11 +502,11 @@ svm_range_validate_svm_bo(struct kfd_node *node, struct svm_range *prange)
        /* We need a new svm_bo. Spin-loop to wait for concurrent
       * svm_range_bo_release to finish removing this range from
-     * its range list. After this, it is safe to reuse the
-     * svm_bo pointer and svm_bo_list head.
+     * its range list and set prange->svm_bo to null. After this,
+     * it is safe to reuse the svm_bo pointer and svm_bo_list head.
       */
-    while (!list_empty_careful(&prange->svm_bo_list))
-        ;
+    while (!list_empty_careful(&prange->svm_bo_list) || prange->svm_bo)
+        schedule();
        return false;
  }
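
For readers following the thread, the two points in the patch description can be illustrated with a small userspace toy model. This is only a sketch, not the kfd_svm.c code: struct toy_range, toy_yield(), toy_wait_for_release() and toy_demo() are hypothetical names, and toy_yield() merely stands in for a yield point such as schedule() or cond_resched(). It assumes, as the patch description implies, that the async release first removes the range from the list and only later sets the svm_bo pointer to NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields the wait loop looks at. */
struct toy_range {
    bool on_bo_list;   /* models !list_empty_careful(&prange->svm_bo_list) */
    void *svm_bo;      /* models prange->svm_bo */
};

/* Deferred release work that only makes progress when the waiter yields. */
static struct toy_range *pending_release;
static int release_step;

/*
 * Models one yield: the pending release advances by one step.
 * Assumed order: list removal first, pointer reset afterwards.
 */
static void toy_yield(void)
{
    struct toy_range *r = pending_release;

    if (r == NULL)
        return;
    if (release_step == 0) {
        r->on_bo_list = false;
        release_step = 1;
    } else {
        r->svm_bo = NULL;
        pending_release = NULL;
    }
}

/* Wait loop; check_pointer selects the old (false) or new (true) condition. */
static void toy_wait_for_release(struct toy_range *r, bool check_pointer)
{
    while (r->on_bo_list || (check_pointer && r->svm_bo != NULL))
        toy_yield();   /* without a yield the release could never run */
}

/* Returns 1 if svm_bo is already NULL when the wait ends, 0 otherwise. */
int toy_demo(bool check_pointer)
{
    static int old_bo;
    struct toy_range r = { .on_bo_list = true, .svm_bo = &old_bo };
    int cleared;

    pending_release = &r;
    release_step = 0;
    toy_wait_for_release(&r, check_pointer);
    cleared = (r.svm_bo == NULL);
    pending_release = NULL;   /* drop the pointer to the local range */
    return cleared;
}
```

In this model, toy_demo(false) leaves the wait while the old pointer is still set, which is the window in which a later reset could clobber a newly allocated buffer. toy_demo(true) only returns once the pointer has been cleared.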


