Re: [PATCH 2/2] drm/ttm: Fix COW check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 10.07.21 um 02:29 schrieb Felix Kuehling:

On 2021-07-09 3:37 p.m., Christian König wrote:
Am 09.07.21 um 21:31 schrieb Felix Kuehling:
On 2021-07-09 2:38 a.m., Christian König wrote:


Am 08.07.21 um 21:36 schrieb Alex Deucher:
From: Felix Kuehling <Felix.Kuehling@xxxxxxx>

KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
is_cow_mapping returns true for these mappings. Add a check for
vm_flags & VM_WRITE to avoid mmap failures on private read-only or
PROT_NONE mappings.

I'm pretty sure that this is not working as expected.

Not sure what you mean. Debugger access to the memory through the PROT_NONE VMAs is definitely working, with both ptrace and /proc/<pid>/mem.




Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
---
drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..a75e90c7d4aa 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -552,7 +552,7 @@ static const struct vm_operations_struct ttm_bo_vm_ops = {   int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
  {
      /* Enforce no COW since would have really strange behavior with it. */
-    if (is_cow_mapping(vma->vm_flags))
+    if (is_cow_mapping(vma->vm_flags) && (vma->vm_flags & VM_WRITE))

is_cow_mapping() already checks for VM_MAYWRITE, so this here shouldn't be necessary.

AFAICT, VM_MAYWRITE is not based on the PROT_... bits used to create the VMA, but based on the permissions of the file. So as long as the render node is writable, VM_MAYWRITE is set for all VMAs that map it.

I would agree that it's probably a bad idea for the Thunk to map these VMAs with MAP_PRIVATE. We can try changing the Thunk to use MAP_SHARED for these PROT_NONE mappings. But that doesn't change the fact that this kernel patch broke existing usermode.

For the record, changing the Thunk to use MAP_SHARED works.



Yeah, but see the discussion around MAP_PRIVATE and COW regarding this. What Thunk did here was a really bad idea.

Hindsight ... Which part was a bad idea?

 * Creating a PROT_NONE VMA? That's necessary to let ptrace or
   /proc/<pid>/mem access the memory. It's a brilliant idea, IMHO
 * Making that VMA MAP_PRIVATE? Probably a bad idea in hindsight. The
   process itself can't access this memory, let alone write to it. So I
   didn't think too much about whether to make it shared or private.
   Either way, we are not causing any actual COW on these mappings
   because they are not writable, and we never make them writable with
   mprotect either
 * Introducing a COW check after letting it slide for 15 years? It's a
   reasonable check. In this case it catches a false positive. Had this
   check been in place 4 or 5 years ago, it would have easily prevented
   this mess

A combination of everything.

MAP_PRIVATE would have previously caused very odd behavior and/or a BUG_ON()/WARN_ON() in the VMA code (depending on platform).

MAP_PRIVATE + PROT_NONE kind of worked because it never faulted a page so we never actually triggered a COW.



I think we have only two options here:
1. Accept the breakage of thunk.

Really?

Only as last resort, I'm running out of ideas how to fix this cleanly.



2. Add a workaround in amdgpu/amdkfd to force MAP_PRIVATE into MAP_SHARED in the kernel.

I tried to do this in amdgpu_gem_object_mmap and spent 4 hours debugging why it doesn't work. It breaks because the mapping->i_mmap_writable gets unbalanced and causes subsequent mappings to fail when that atomic counter becomes negative (indicating DENYWRITE). I could fix it by calling mapping_map_writable. But I don't think this is intended as driver API. I see no precedent of any driver calling this. For the record this works, but it feels fragile and ugly:

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -255,6 +255,20 @@ static int amdgpu_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_str
     if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
         return -EPERM;

+    /* Workaround for Thunk bug creating PROT_NONE,MAP_PRIVATE mappings
+     * for debugger access to invisible VRAM. Should have used MAP_SHARED
+     * instead.
+     */
+    if (is_cow_mapping(vma->vm_flags) &&
+        !(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC))) {
+        int ret;
+
+        ret = mapping_map_writable(vma->vm_file->f_mapping);
+        if (ret)
+            return ret;
+        vma->vm_flags |= VM_SHARED | VM_MAYSHARE;
+    }
+

Yeah, I'm on another thread where we are discussing mapping_map_writable() for vma_set_file() as well.

return drm_gem_ttm_mmap(obj, vma);
 }

3. Improve my previous workaround by adding a similar check for COW in ttm_bo_vm_ops.mprotect. I'll send an updated patch.

That has it's own issues, but going to reply there in detail.

Regards,
Christian.


Regards,
  Felix



Regards,
Christian.


Regards,
  Felix



Christian.

          return -EINVAL;
        ttm_bo_get(bo);



_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux