On 2023-01-09 14:27, Eric Huang wrote:
There will be data corruption on vram allocated by svm if initialization is not being done. Adding sync is to resolve this issue. Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx> --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index b8c9753a4818..344e20306635 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -574,6 +574,13 @@ svm_range_vram_node_new(struct amdgpu_device *adev, struct svm_range *prange, goto reserve_bo_failed; }
Thanks for catching this bug, we could optimize as clear VRAM is only needed for partial migration, ex pass the clear flag clear = (cpages != npages) from svm_migrate_vma_to_vram -> svm_migrate_copy_to_vram -> svm_range_vram_node_new.
Regards, Philip
+ r = amdgpu_bo_sync_wait(bo, AMDGPU_FENCE_OWNER_KFD, false); + if (r) { + pr_debug("failed %d to sync bo\n", r); + amdgpu_bo_unreserve(bo); + goto reserve_bo_failed; + } + r = dma_resv_reserve_fences(amdkcl_ttm_resvp(&bo->tbo), 1); if (r) { pr_debug("failed %d to reserve bo\n", r);