On 2022-07-07 12:45, Philip Yang wrote:
An MMU notifier callback that unmaps an SVM range and updates the page
table may free the PTB BO. amdgpu_fill_buffer then zeroes the BO memory,
which could cause a deadlock because kmalloc may trigger the MMU notifier.
amdgpu_vm_pt_clear sets up PTB BO memory with an initial value, and the
page table holds no sensitive data that must be wiped before the memory
is released. So don't clear the memory when a PTB BO is released.
The problem happens if the memory is used for a non-pagetable BO after
it has been freed. In that case it doesn't get initialized. That can
leak data from the page table to a user mode application. And it can
leak RAS poison to a new user mode allocation.
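To make the reuse problem concrete, here is a minimal userspace sketch. It is a toy model, not amdgpu code: the one-slot pool and every toy_* name are made up for illustration. It only shows that a buffer released without a wipe hands its old contents to the next owner.

```c
#include <stddef.h>
#include <string.h>

/* Toy userspace model, not amdgpu code: a one-slot "VRAM" pool. */
#define POOL_SIZE 64

static unsigned char pool[POOL_SIZE];

/* Allocation hands out the slot as-is; like a new non-page-table BO,
 * nothing initializes it on the way out. */
static unsigned char *toy_alloc(void)
{
	return pool;
}

/* Free optionally wipes, standing in for a wipe-on-release flag. */
static void toy_free(unsigned char *buf, int wipe_on_release)
{
	if (wipe_on_release)
		memset(buf, 0, POOL_SIZE);
}

/* Returns how many nonzero bytes the next owner can observe. */
static size_t toy_leaked_bytes(int wipe_on_release)
{
	unsigned char *pt = toy_alloc();
	unsigned char *user;
	size_t leaked = 0;
	size_t i;

	memset(pt, 0xAB, POOL_SIZE);	/* stand-in for page table contents */
	toy_free(pt, wipe_on_release);

	user = toy_alloc();		/* reused by a "user" allocation */
	for (i = 0; i < POOL_SIZE; i++)
		if (user[i] != 0)
			leaked++;
	return leaked;
}
```

In this model toy_leaked_bytes(0) reports the whole buffer as visible to the next owner, while toy_leaked_bytes(1) reports nothing left over.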
Therefore this memory must be wiped on free. If this causes issues in
the context of the MMU notifier, then the wiping of memory should be
deferred to a worker thread. The code for delayed freeing already exists
for cases where memory has an unsignaled fence when it is freed. So it
should be possible to create a fence, attach it to the BO before freeing
it (maybe after it is individualized in amdgpu_bo_release_notify), and
only signal it after freeing the BO. Or maybe there is a more
straightforward way to force delayed freeing of PT BOs in the MMU notifier.
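As a rough illustration of the deferral idea only, here is a userspace toy model. It does not use fences or the TTM delayed-delete path, and all toy_* names are hypothetical. The free path called from the restricted context just queues the buffer. The wipe happens later, from a separate "worker" step, and only then does the buffer become reusable.

```c
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 32
#define TOY_MAX_DEFERRED 8

/* Hypothetical stand-in for a buffer object. */
struct toy_bo {
	unsigned char data[TOY_BUF_SIZE];
	int reusable;	/* set only after the buffer has been wiped */
};

static struct toy_bo *deferred[TOY_MAX_DEFERRED];
static size_t deferred_count;

/* Called from the restricted context: no wiping and no allocation,
 * only queue the buffer. Returns 0 on success, -1 if the queue is full. */
static int toy_free_deferred(struct toy_bo *bo)
{
	if (deferred_count == TOY_MAX_DEFERRED)
		return -1;
	bo->reusable = 0;
	deferred[deferred_count++] = bo;
	return 0;
}

/* Called later from the "worker": wipe each queued buffer, then mark
 * it reusable. Returns the number of buffers processed. */
static size_t toy_worker_run(void)
{
	size_t n = deferred_count;
	size_t i;

	for (i = 0; i < n; i++) {
		memset(deferred[i]->data, 0, TOY_BUF_SIZE);
		deferred[i]->reusable = 1;
		deferred[i] = NULL;
	}
	deferred_count = 0;
	return n;
}
```

The point of the split is that toy_free_deferred does nothing that could recurse into the caller's context. In the real driver the equivalent guarantee would have to come from the fence or delayed-free mechanism, which this sketch does not model.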
Another alternative is to ensure that kmalloc in amdgpu_fill_buffer can
never cause recursive MMU notifiers by using GFP_ATOMIC or
memalloc_noreclaim_save/restore. BTW, I don't see where the kmalloc is
happening. I guess it's somewhere lower in the call stack.
Regards,
Felix
Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8a7b0f6162da..65b4ff6979ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -576,7 +576,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 	if (!amdgpu_bo_support_uswc(bo->flags))
 		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC;
 
-	if (adev->ras_enabled)
+	if (adev->ras_enabled && bp->type != ttm_bo_type_kernel)
 		bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
 
 	bo->tbo.bdev = &adev->mman.bdev;