Re: [PATCH 2/2] drm/ttm: Fix COW check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2021-07-09 3:37 p.m., Christian König wrote:
Am 09.07.21 um 21:31 schrieb Felix Kuehling:
On 2021-07-09 2:38 a.m., Christian König wrote:


Am 08.07.21 um 21:36 schrieb Alex Deucher:
From: Felix Kuehling <Felix.Kuehling@xxxxxxx>

KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
is_cow_mapping returns true for these mappings. Add a check for
vm_flags & VM_WRITE to avoid mmap failures on private read-only or
PROT_NONE mappings.

I'm pretty sure that this is not working as expected.

Not sure what you mean. Debugger access to the memory through the PROT_NONE VMAs is definitely working, with both ptrace and /proc/<pid>/mem.




Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
---
drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..a75e90c7d4aa 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -552,7 +552,7 @@ static const struct vm_operations_struct ttm_bo_vm_ops = {   int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
  {
      /* Enforce no COW since would have really strange behavior with it. */
-    if (is_cow_mapping(vma->vm_flags))
+    if (is_cow_mapping(vma->vm_flags) && (vma->vm_flags & VM_WRITE))

is_cow_mapping() already checks for VM_MAYWRITE, so this here shouldn't be necessary.

AFAICT, VM_MAYWRITE is not based on the PROT_... bits used to create the VMA, but based on the permissions of the file. So as long as the render node is writable, VM_MAYWRITE is set for all VMAs that map it.

I would agree that it's probably a bad idea for the Thunk to map these VMAs with MAP_PRIVATE. We can try changing the Thunk to use MAP_SHARED for these PROT_NONE mappings. But that doesn't change the fact that this kernel patch broke existing usermode.

For the record, changing the Thunk to use MAP_SHARED works.



Yeah, but see the discussion around MAP_PRIVATE and COW regarding this. What Thunk did here was a really bad idea.

Hindsight ... Which part was a bad idea?

 * Creating a PROT_NONE VMA? That's necessary to let ptrace or
   /proc/<pid>/mem access the memory. It's a brilliant idea, IMHO
 * Making that VMA MAP_PRIVATE? Probably a bad idea in hindsight. The
   process itself can't access this memory, let alone write to it. So I
   didn't think too much about whether to make it shared or private.
   Either way, we are not causing any actual COW on these mappings
   because they are not writable, and we never make them writable with
   mprotect either
 * Introducing a COW check after letting it slide for 15 years? It's a
   reasonable check. In this case it catches a false positive. Had this
   check been in place 4 or 5 years ago, it would have easily prevented
   this mess



I think we have only two options here:
1. Accept the breakage of thunk.

Really?


2. Add a workaround in amdgpu/amdkfd to force MAP_PRIVATE into MAP_SHARED in the kernel.

I tried to do this in amdgpu_gem_object_mmap and spent 4 hours debugging why it doesn't work. It breaks because the mapping->i_mmap_writable gets unbalanced and causes subsequent mappings to fail when that atomic counter becomes negative (indicating DENYWRITE). I could fix it by calling mapping_map_writable. But I don't think this is intended as driver API. I see no precedent of any driver calling this. For the record this works, but it feels fragile and ugly:

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -255,6 +255,20 @@ static int amdgpu_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_str
 	if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
 		return -EPERM;
+ /* Workaround for Thunk bug creating PROT_NONE,MAP_PRIVATE mappings
+	 * for debugger access to invisible VRAM. Should have used MAP_SHARED
+	 * instead.
+	 */
+	if (is_cow_mapping(vma->vm_flags) &&
+	    !(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC))) {
+		int ret;
+
+		ret = mapping_map_writable(vma->vm_file->f_mapping);
+		if (ret)
+			return ret;
+		vma->vm_flags |= VM_SHARED | VM_MAYSHARE;
+	}
+
 	return drm_gem_ttm_mmap(obj, vma);
 }

3. Improve my previous workaround by adding a similar check for COW in ttm_bo_vm_ops.mprotect. I'll send an updated patch.

Regards,
  Felix



Regards,
Christian.


Regards,
  Felix



Christian.

          return -EINVAL;
        ttm_bo_get(bo);





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux