Re: [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 29, 2022 at 5:51 PM Carlos Llamas <cmllamas@xxxxxxxxxx> wrote:
>
> Commit c462ac288f2c ("mm: Introduce arch_validate_flags()") added a late
> check in mmap_region() to let architectures validate vm_flags. The check
> needs to happen after calling ->mmap() as the flags can potentially be
> modified during this callback.
>
> If arch_validate_flags() check fails we unmap and free the vma. However,
> the error path fails to undo the ->mmap() call that previously succeeded
> and depending on the specific ->mmap() implementation this translates to
> reference increments, memory allocations and other operations what will
> not be cleaned up.
>
> There are several places (mainly device drivers) where this is an issue.
> However, one specific example is bpf_map_mmap() which keeps count of the
> mappings in map->writecnt. The count is incremented on ->mmap() and then
> decremented on vm_ops->close(). When arch_validate_flags() fails this
> count is off since bpf_map_mmap_close() is never called.
>
> One can reproduce this issue in arm64 devices with MTE support. Here the
> vm_flags are checked to only allow VM_MTE if VM_MTE_ALLOWED has been set
> previously. From userspace then is enough to pass the PROT_MTE flag to
> mmap() syscall to trigger the arch_validate_flags() failure.
>
> The following program reproduces this issue:
> ---
>   #include <stdio.h>
>   #include <unistd.h>
>   #include <linux/unistd.h>
>   #include <linux/bpf.h>
>   #include <sys/mman.h>
>
>   int main(void)
>   {
>         union bpf_attr attr = {
>                 .map_type = BPF_MAP_TYPE_ARRAY,
>                 .key_size = sizeof(int),
>                 .value_size = sizeof(long long),
>                 .max_entries = 256,
>                 .map_flags = BPF_F_MMAPABLE,
>         };
>         int fd;
>
>         fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
>         mmap(NULL, 4096, PROT_WRITE | PROT_MTE, MAP_SHARED, fd, 0);
>
>         return 0;
>   }
> ---
>
> By manually adding some log statements to the vm_ops callbacks we can
> confirm that when passing PROT_MTE to mmap() the map->writecnt is off
> upon ->release():
>
> With PROT_MTE flag:
>   root@debian:~# ./bpf-test
>   [  111.263874] bpf_map_write_active_inc: map=9 writecnt=1
>   [  111.288763] bpf_map_release: map=9 writecnt=1
>
> Without PROT_MTE flag:
>   root@debian:~# ./bpf-test
>   [  157.816912] bpf_map_write_active_inc: map=10 writecnt=1
>   [  157.830442] bpf_map_write_active_dec: map=10 writecnt=0
>   [  157.832396] bpf_map_release: map=10 writecnt=0
>
> This patch fixes the above issue by calling vm_ops->close() when the
> arch_validate_flags() check fails, after this we can proceed to unmap
> and free the vma on the error path.
>
> Fixes: c462ac288f2c ("mm: Introduce arch_validate_flags()")
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Liam Howlett <liam.howlett@xxxxxxxxxx>
> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v5.10+
> Signed-off-by: Carlos Llamas <cmllamas@xxxxxxxxxx>
> ---

Makes sense to me, open/close callbacks should be symmetrical. From
BPF-side of things:

Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>

>  mm/mmap.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9d780f415be3..36c08e2c78da 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1797,7 +1797,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>         if (!arch_validate_flags(vma->vm_flags)) {
>                 error = -EINVAL;
>                 if (file)
> -                       goto unmap_and_free_vma;
> +                       goto close_and_free_vma;
>                 else
>                         goto free_vma;
>         }
> @@ -1844,6 +1844,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>
>         return addr;
>
> +close_and_free_vma:
> +       if (vma->vm_ops && vma->vm_ops->close)
> +               vma->vm_ops->close(vma);
>  unmap_and_free_vma:
>         fput(vma->vm_file);
>         vma->vm_file = NULL;
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux