Re: [PATCH] drm/amdkfd: Add sync after creating vram bo

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 2023-01-09 um 15:23 schrieb Felix Kuehling:
Am 2023-01-09 um 15:18 schrieb Philip Yang:

On 2023-01-09 14:27, Eric Huang wrote:
There will be data corruption on vram allocated by svm
if initialization is not being done. Adding sync is to
resolve this issue.

Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx>
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 7 +++++++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index b8c9753a4818..344e20306635 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -574,6 +574,13 @@ svm_range_vram_node_new(struct amdgpu_device *adev, struct svm_range *prange,
          goto reserve_bo_failed;
      }

Thanks for catching this bug, we could optimize as clear VRAM is only needed for partial migration, ex pass the clear flag clear = (cpages != npages) from svm_migrate_vma_to_vram -> svm_migrate_copy_to_vram -> svm_range_vram_node_new.

I only see one call to svm_range_vram_node_new, and it passes "true" unconditionally. What am I missing?

I think you're suggesting an optimization in svm_migrate_copy_to_vram. I think that makes sense. We don't need to initialize VRAM if we know it's going to be overwritten immediately with data migrated from system memory. But that's outside the scope of this fix.

Regards,
  Felix



That said, if VRAM is not cleared, there will be no fence to wait for, so the amdgpu_bo_sync_wait call is basically a no-op with a little bit of overhead for creating and destroying an empty sync object.

Regards,
  Felix



Regards,

Philip

+    r = amdgpu_bo_sync_wait(bo, AMDGPU_FENCE_OWNER_KFD, false);
+    if (r) {
+        pr_debug("failed %d to sync bo\n", r);
+        amdgpu_bo_unreserve(bo);
+        goto reserve_bo_failed;
+    }
+
      r = dma_resv_reserve_fences(amdkcl_ttm_resvp(&bo->tbo), 1);
      if (r) {
          pr_debug("failed %d to reserve bo\n", r);



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux