Re: Closing the BPF map permission loophole

Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> · Fri, 16 Sep 2022 05:03:32 +0200



On Thu, 15 Sept 2022 at 12:41, Lorenz Bauer <oss@xxxxxx> wrote:
>
> Hi list,
>
> Here is a summary of the talk I gave at LPC '22 titled "Closing the BPF map permission loophole", with slides at [0].
>
> Problem #1: Read-only fds can be modified via a BPF program
>
> 1. Craft a BPF program that executes bpf_map_update_elem(read-only fd, ...)
> 2. Load the program & execute it
>
> The reason is that the verifier only checks bpf_map->map_flags in resolve_pseudo_ldimm64, but ignores fd->f_mode.
>
> Fixing this problem is complicated by the fact that a user may use several distinct fds with differing permissions to refer to the same map, but that the verifier internally only tracks unique struct bpf_map. See [1].
>
> Problem #2: Read-only fds can be "transmuted" into read-write fds via map in map
>
> 1. BPF_MAP_UPDATE_ELEM(map in map fd, read-only fd)
> 2. BPF_MAP_LOOKUP_ELEM(map in map fd) = read-write fd
>
> This was pointed out by Stanislav Fomichev during the LPC session. I've not yet tried this myself.
>
> Problem #3: Read-only fds can be transmuted into read-write fds via object pinning
>
> 1. BPF_OBJ_PIN(read-only fd, /sys/fs/bpf/foo)
> 2. BPF_OBJ_GET(/sys/fs/bpf/foo) = read-write fd
>

Just for completeness, this was also pointed out by Andy back in 2019:
https://lore.kernel.org/bpf/CALCETrVtPs8gY-H4gmzSqPboid3CB++n50SvYd6RU9YVde_-Ow@xxxxxxxxxxxxxx

> The problem is with BPF_OBJ_PIN semantics: regardless of fd->f_mode, pinning creates an inode that is owned by the current user, with mode o=rw. Even if we made the inode o=r, a user / attacker can still use chmod(2) to change it back to o=rw.
>
> On older kernels, this requires either unprivileged BPF or CAP_BPF, but recently BPF_OBJ_PIN has been made available without CAP_BPF.
>
> This problem also applies to other BPF objects: links, programs, maybe iterators? Since we don't have BPF_F_RDONLY semantics for those the issue is maybe less urgent, but see [2] for some more fun.
>
> A number of ideas were explored during the session:
>
> * In OBJ_PIN, create the inode owned by the user that executed MAP_CREATE, not the user that
>   invoked OBJ_PIN. This would allow unprivileged users to create files as another user, which
>   seems like a bad idea.
> * In OBJ_GET, refuse a read-write fd if the fd passed to OBJ_PIN wasn't read-write. This is not
>   possible since we store struct bpf_map * in the inode, so we don't have access to fd->f_mode
>   anymore.
> * In OBJ_PIN, adjust the mode of the created inode to match fd->f_mode, and later refuse attempts
>   to chmod(2). After a cursory glance at the source code it seems like there are no hooks for
>   filesystems to influence chmod.
>
> My gut feeling is that the root of the problem is that OBJ_PIN is too permissive. Once an inode exists that is owned by the current user the cat is out of the box.
>
> BPF_F_RDONLY and BPF_F_WRONLY were introduced in 4.15 [3]. If we want to fix this properly, aka relying on BPF_R_RDONLY doesn't introduce a gaping hole, we'll have to do quite a bit of backporting.
>
> I plan on submitting a sledgehammer approach fix for #1 and #2 as discussed with Daniel after my presentation.
>
> #3 is in sore need of further discussion and creativity. One avenue I want to explore is whether we can refuse OBJ_PIN if:
> - the current user is not the map creator
> - and the fd is not r/w
> - and the current user has no CAP_DAC_OVERRIDE (or similar)
>
> Thanks
> Lorenz
>
> 0: https://lpc.events/event/16/contributions/1372/attachments/977/2059/Plumbers%2022%20Closing%20the%20BPF%20map%20permission%20loophole.pdf
> 1: https://elixir.bootlin.com/linux/v6.0-rc5/source/kernel/bpf/verifier.c#L12839
> 2: https://lore.kernel.org/bpf/20210618105526.265003-1-zenczykowski@xxxxxxxxx/
> 3: https://github.com/torvalds/linux/commit/6e71b04a8224