On 10/12/2023 12:48 PM, Felix Kuehling wrote:
On 2023-10-12 12:34, Xiaogang.Chen wrote:
From: Xiaogang Chen <xiaogang.chen@xxxxxxx>
This is needed to correctly handle BOs imported into compute VM from
gfx.
Both kfd and gfx should use same bo_va and set bo_va->ref_count
correctly
when map the Bos into same VM, otherwise we may trigger kernel general
protection when iterate mappings over bo_va's valids or invalids list.
Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
Signed-off-by: Xiaogang Chen <Xiaogang.Chen@xxxxxxx>
Acked-by: Christian König <christian.koenig@xxxxxxx>
Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@xxxxxxx>
Tested-by: Xiaogang Chen <Xiaogang.Chen@xxxxxxx>
Not sure if it makes sense to add my Reviewed-by, given that I mostly
wrote this patch. But feel free to submit this to the branch.
ok, will submit it.
Thanks
Xiaogang
Thanks,
Felix
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index a15e59abe70a..c1ec93cc50ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -832,6 +832,7 @@ static int kfd_mem_attach(struct amdgpu_device
*adev, struct kgd_mem *mem,
uint64_t va = mem->va;
struct kfd_mem_attachment *attachment[2] = {NULL, NULL};
struct amdgpu_bo *bo[2] = {NULL, NULL};
+ struct amdgpu_bo_va *bo_va;
bool same_hive = false;
int i, ret;
@@ -919,7 +920,13 @@ static int kfd_mem_attach(struct amdgpu_device
*adev, struct kgd_mem *mem,
pr_debug("Unable to reserve BO during memory attach");
goto unwind;
}
- attachment[i]->bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
+ bo_va = amdgpu_vm_bo_find(vm, bo[i]);
+ if (!bo_va)
+ bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
+ else
+ ++bo_va->ref_count;
+ attachment[i]->bo_va = bo_va;
+
amdgpu_bo_unreserve(bo[i]);
if (unlikely(!attachment[i]->bo_va)) {
ret = -ENOMEM;
@@ -943,7 +950,8 @@ static int kfd_mem_attach(struct amdgpu_device
*adev, struct kgd_mem *mem,
continue;
if (attachment[i]->bo_va) {
amdgpu_bo_reserve(bo[i], true);
- amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
+ if (--attachment[i]->bo_va->ref_count == 0)
+ amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
amdgpu_bo_unreserve(bo[i]);
list_del(&attachment[i]->list);
}