Patch "bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic" has been added to the 6.13-stable tree

This is a note to let you know that I've just added the patch titled

    bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic

to the 6.13-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-unify-vm_write-vs-vm_maywrite-use-in-bpf-map-mma.patch
and it can be found in the queue-6.13 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit eade031016822b7ad80c1e9a2a57a2a696914dd4
Author: Andrii Nakryiko <andrii@xxxxxxxxxx>
Date:   Tue Jan 28 17:22:45 2025 -0800

    bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic
    
    [ Upstream commit 98671a0fd1f14e4a518ee06b19037c20014900eb ]
    
    For all BPF maps we ensure that VM_MAYWRITE is cleared when
    memory-mapping BPF map contents as initially read-only VMA. This is
    because in some cases BPF verifier relies on the underlying data to not
    be modified afterwards by user space, so once something is mapped
    read-only, it shouldn't be re-mmap'ed as read-write.
    
    As such, it's not necessary to check VM_MAYWRITE in bpf_map_mmap() and
    map->ops->map_mmap() callbacks: VM_WRITE should be consistently set for
    read-write mappings, and if VM_WRITE is not set, there is no way for
    user space to upgrade read-only mapping to read-write one.
    
    This patch cleans up this VM_WRITE vs VM_MAYWRITE handling within
    bpf_map_mmap(), which is an entry point for any BPF map mmap()-ing
    logic. We also drop unnecessary sanitization of VM_MAYWRITE in BPF
    ringbuf's map_mmap() callback implementation, as it is already performed
    by common code in bpf_map_mmap().
    
    Note, though, that in bpf_map_mmap_{open,close}() callbacks we can't
    drop VM_MAYWRITE use, because it's possible (and is outside of
    subsystem's control) to have initially read-write memory mapping, which
    is subsequently dropped to read-only by user space through mprotect().
    In such case, from BPF verifier POV it's read-write data throughout the
    lifetime of BPF map, and is counted as "active writer".
    
    But its VMAs will start out as VM_WRITE|VM_MAYWRITE, then mprotect() can
    change it to just VM_MAYWRITE (and no VM_WRITE), so when it's finally
    munmap()'ed and bpf_map_mmap_close() is called, vm_flags will be just
    VM_MAYWRITE, but we still need to decrement active writer count with
    bpf_map_write_active_dec() as it's still considered to be a read-write
    mapping by the rest of BPF subsystem.
    
    Similar reasoning applies to bpf_map_mmap_open(), which is called
    whenever mmap(), munmap(), and/or mprotect() forces mm subsystem to
    split original VMA into multiple discontiguous VMAs.
    
    Memory-mapping handling is a bit tricky, yes.
    
    Cc: Jann Horn <jannh@xxxxxxxxxx>
    Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
    Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
    Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20250129012246.1515826-1-andrii@xxxxxxxxxx
    Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
    Stable-dep-of: bc27c52eea18 ("bpf: avoid holding freeze_mutex during mmap operation")
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index e1cfe890e0be6..1499d8caa9a35 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -268,8 +268,6 @@ static int ringbuf_map_mmap_kern(struct bpf_map *map, struct vm_area_struct *vma
 		/* allow writable mapping for the consumer_pos only */
 		if (vma->vm_pgoff != 0 || vma->vm_end - vma->vm_start != PAGE_SIZE)
 			return -EPERM;
-	} else {
-		vm_flags_clear(vma, VM_MAYWRITE);
 	}
 	/* remap_vmalloc_range() checks size and offset constraints */
 	return remap_vmalloc_range(vma, rb_map->rb,
@@ -289,8 +287,6 @@ static int ringbuf_map_mmap_user(struct bpf_map *map, struct vm_area_struct *vma
 			 * position, and the ring buffer data itself.
 			 */
 			return -EPERM;
-	} else {
-		vm_flags_clear(vma, VM_MAYWRITE);
 	}
 	/* remap_vmalloc_range() checks size and offset constraints */
 	return remap_vmalloc_range(vma, rb_map->rb, vma->vm_pgoff + RINGBUF_PGOFF);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 5684e8ce132d5..60417b79639e5 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1061,15 +1061,21 @@ static int bpf_map_mmap(struct file *filp, struct vm_area_struct *vma)
 	vma->vm_ops = &bpf_map_default_vmops;
 	vma->vm_private_data = map;
 	vm_flags_clear(vma, VM_MAYEXEC);
+	/* If mapping is read-only, then disallow potentially re-mapping with
+	 * PROT_WRITE by dropping VM_MAYWRITE flag. This VM_MAYWRITE clearing
+	 * means that as far as BPF map's memory-mapped VMAs are concerned,
+	 * VM_WRITE and VM_MAYWRITE and equivalent, if one of them is set,
+	 * both should be set, so we can forget about VM_MAYWRITE and always
+	 * check just VM_WRITE
+	 */
 	if (!(vma->vm_flags & VM_WRITE))
-		/* disallow re-mapping with PROT_WRITE */
 		vm_flags_clear(vma, VM_MAYWRITE);
 
 	err = map->ops->map_mmap(map, vma);
 	if (err)
 		goto out;
 
-	if (vma->vm_flags & VM_MAYWRITE)
+	if (vma->vm_flags & VM_WRITE)
 		bpf_map_write_active_inc(map);
 out:
 	mutex_unlock(&map->freeze_mutex);
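
The flag reasoning in the commit message can be traced with a small user-space model. This is only a sketch under stated assumptions: the toy_* names, the TOY_VM_* constants, and the single writecnt field are hypothetical stand-ins invented for illustration, not kernel code. It assumes the mapping starts with VM_MAYWRITE available, models mprotect() as refusing to add write access without VM_MAYWRITE, and models the open/close callbacks as testing VM_MAYWRITE, as the message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag values for this sketch only; the kernel has its own definitions. */
#define TOY_VM_WRITE    0x02u
#define TOY_VM_MAYWRITE 0x20u

struct toy_map { int writecnt; };	/* stand-in for the active-writer count */
struct toy_vma { unsigned int flags; struct toy_map *map; };

/* Models the patched bpf_map_mmap(): clear MAYWRITE if read-only, count on WRITE. */
static void toy_mmap(struct toy_map *map, struct toy_vma *vma, bool prot_write)
{
	vma->map = map;
	vma->flags = TOY_VM_MAYWRITE | (prot_write ? TOY_VM_WRITE : 0u);
	if (!(vma->flags & TOY_VM_WRITE))
		vma->flags &= ~TOY_VM_MAYWRITE;	/* read-only stays read-only */
	if (vma->flags & TOY_VM_WRITE)
		map->writecnt++;
}

/* Models mprotect(): adding write access requires MAYWRITE (toy error code -1). */
static int toy_mprotect(struct toy_vma *vma, bool prot_write)
{
	if (prot_write && !(vma->flags & TOY_VM_MAYWRITE))
		return -1;
	if (prot_write)
		vma->flags |= TOY_VM_WRITE;
	else
		vma->flags &= ~TOY_VM_WRITE;
	return 0;
}

/* Models a VMA split followed by the open callback on the new piece. */
static void toy_split(const struct toy_vma *vma, struct toy_vma *piece)
{
	*piece = *vma;
	if (piece->flags & TOY_VM_MAYWRITE)
		piece->map->writecnt++;
}

/* Models munmap() reaching the close callback, which must test MAYWRITE. */
static void toy_munmap(struct toy_vma *vma)
{
	if (vma->flags & TOY_VM_MAYWRITE)
		vma->map->writecnt--;
	vma->flags = 0;
}

/* Read-write mapping, downgraded, split, then fully unmapped. */
int toy_rw_then_downgrade(void)
{
	struct toy_map map = { .writecnt = 0 };
	struct toy_vma vma, piece;

	toy_mmap(&map, &vma, true);	/* WRITE|MAYWRITE, one writer */
	toy_mprotect(&vma, false);	/* only MAYWRITE remains */
	toy_split(&vma, &piece);	/* the piece is counted as a writer too */
	toy_munmap(&piece);
	toy_munmap(&vma);
	return map.writecnt;		/* balanced only because close tests MAYWRITE */
}

/* Read-only mapping: the later upgrade attempt is refused. */
int toy_ro_upgrade_attempt(void)
{
	struct toy_map map = { .writecnt = 0 };
	struct toy_vma vma;
	int rc;

	toy_mmap(&map, &vma, false);	/* neither flag set, no writer counted */
	rc = toy_mprotect(&vma, true);
	toy_munmap(&vma);
	return rc;
}
```

Calling the two scenario functions from a main() shows both halves of the argument. If toy_munmap() tested TOY_VM_WRITE instead, the downgraded mapping would never be uncounted. Testing TOY_VM_WRITE in toy_mmap() is safe because the two flags always agree at that point.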



