Re: [PATCH v2] drm/amdgpu: Fix the race condition for draining retry fault

Felix Kuehling <felix.kuehling@xxxxxxx> · Wed, 5 Mar 2025 20:24:44 -0500

On 2025-03-05 20:10, Emily Deng wrote:
Issue:
In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts
  has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu
is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages
continues execution and reaches svm_range_from_addr. This results in
a "failed to find prange..." error, causing the page recovery to fail.

How to fix:
Move the timestamp check code under the protection of svm->lock.

v2:
Make sure all right locks are released before go out.

Signed-off-by: Emily Deng <Emily.Deng@xxxxxxx>
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 ++++++++++++++++------------
  1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index d04725583f19..db898757f92e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
  		goto out;
  	}
  
-	/* check if this page fault time stamp is before svms->checkpoint_ts */
-	if (svms->checkpoint_ts[gpuidx] != 0) {
-		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
-			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
-			r = 0;
-			goto out;
-		} else
-			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
-			 * to zero to avoid following ts wrap around give wrong comparing
-			 */
-			svms->checkpoint_ts[gpuidx] = 0;
-	}
-
  	if (!p->xnack_enabled) {
  		pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
  		r = -EFAULT;
@@ -3040,6 +3027,23 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
  	mmap_read_lock(mm);
  retry_write_locked:
  	mutex_lock(&svms->lock);
+
+	/* check if this page fault time stamp is before svms->checkpoint_ts */
+	if (svms->checkpoint_ts[gpuidx] != 0) {
+		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
+			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
+			r = 0;
+			mutex_unlock(&svms->lock);
+			mmap_read_unlock(mm);
+			mmput(mm);
+			goto out;

I'd prefer goto out_unlock_svm. That way if the locking logic changes, 
we only need to update one place that releases all the right locks in 
the right order.

If you're worried about counting the fault with svm_range_count_fault, I 
recommend setting r = -EAGAIN and only counting faults when r != 
-EAGAIN. That way you only count successful and failed faults but not 
retried ones.

Regards,
  Felix


+		} else
+			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
+			 * to zero to avoid following ts wrap around give wrong comparing
+			 */
+			svms->checkpoint_ts[gpuidx] = 0;
+	}
+
  	prange = svm_range_from_addr(svms, addr, NULL);
  	if (!prange) {
  		pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",