Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6/27/23 16:14, Jiri Slaby wrote:
On 16. 02. 23, 9:09, Helge Deller wrote:
Some architectures have memory cache aliasing requirements (e.g. parisc)
if memory is shared between userspace and kernel. This patch fixes the
kernel to return an aliased address when asked by userspace via mmap().

Signed-off-by: Helge Deller <deller@xxxxxx>
---
v2: Do not allow to map to a user-provided addresss. This forces
programs to write portable code, as usually on x86 mapping to any
address will succeed, while it will fail for most provided address if
used on stricter architectures.

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 862e05e6691d..01fe7437a071 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -72,6 +72,7 @@
  #include <linux/io_uring.h>
  #include <linux/audit.h>
  #include <linux/security.h>
+#include <asm/shmparam.h>

  #define CREATE_TRACE_POINTS
  #include <trace/events/io_uring.h>
@@ -3059,6 +3060,54 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
      return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
  }

+static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+            unsigned long addr, unsigned long len,
+            unsigned long pgoff, unsigned long flags)
+{
+    const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+    struct vm_unmapped_area_info info;
+    void *ptr;
+
+    /*
+     * Do not allow to map to user-provided address to avoid breaking the
+     * aliasing rules. Userspace is not able to guess the offset address of
+     * kernel kmalloc()ed memory area.
+     */
+    if (addr)
+        return -EINVAL;
+
+    ptr = io_uring_validate_mmap_request(filp, pgoff, len);
+    if (IS_ERR(ptr))
+        return -ENOMEM;
+
+    info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+    info.length = len;
+    info.low_limit = max(PAGE_SIZE, mmap_min_addr);
+    info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);

Hi,

this breaks compat (x86_32) on x86_64 in 6.4. When you run most liburing tests, you'll get ENOMEM, as this high_limit is something in 64-bit space...

+#ifdef SHM_COLOUR
+    info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL);
+#else
+    info.align_mask = PAGE_MASK & (SHMLBA - 1UL);
+#endif
+    info.align_offset = (unsigned long) ptr;
+
+    /*
+     * A failed mmap() very likely causes application failure,
+     * so fall back to the bottom-up function here. This scenario
+     * can happen with large stack limits and large mmap()
+     * allocations.
+     */
+    addr = vm_unmapped_area(&info);

So the found addr here is > TASK_SIZE - len for 32-bit bins. And get_unmapped_area() returns ENOMEM.

+    if (offset_in_page(addr)) {
+        info.flags = 0;
+        info.low_limit = TASK_UNMAPPED_BASE;
+        info.high_limit = mmap_end;
+        addr = vm_unmapped_area(&info);
+    }
+
+    return addr;
+}

Reverting the whole commit helps of course. Even this completely incorrect hack helps:
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3398,7 +3398,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
                         unsigned long addr, unsigned long len,
                         unsigned long pgoff, unsigned long flags)
  {
-       const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+       const unsigned long mmap_end = in_32bit_syscall() ? task_size_32bit() : arch_get_mmap_end(addr, len, flags);
         struct vm_unmapped_area_info info;
         void *ptr;

@@ -3417,7 +3417,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
         info.flags = VM_UNMAPPED_AREA_TOPDOWN;
         info.length = len;
         info.low_limit = max(PAGE_SIZE, mmap_min_addr);
-       info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
+       info.high_limit = in_32bit_syscall() ? task_size_32bit() : arch_get_mmap_base(addr, current->mm->mmap_base);

I think your "incorrect hack" is actually correct.
If it's the compat case which breaks, then task_size_32bit() might be right.
Maybe adding arch_get_mmap_base() and arch_get_mmap_end() macros to handle
the compat case in to arch/x86/include/asm/* does work, e.g.
#define arch_get_mmap_base(addr, base) \
	(in_32bit_syscall() ? task_size_32bit() : base)

?
Helge




[Index of Archives]     [Linux SoC]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux