On Thu, 15 Sept 2022 at 12:41, Lorenz Bauer <oss@xxxxxx> wrote: > > Hi list, > > Here is a summary of the talk I gave at LPC '22 titled "Closing the BPF map permission loophole", with slides at [0]. > > Problem #1: Read-only fds can be modified via a BPF program > > 1. Craft a BPF program that executes bpf_map_update_elem(read-only fd, ...) > 2. Load the program & execute it > > The reason is that the verifier only checks bpf_map->map_flags in resolve_pseudo_ldimm64, but ignores fd->f_mode. > > Fixing this problem is complicated by the fact that a user may use several distinct fds with differing permissions to refer to the same map, but that the verifier internally only tracks unique struct bpf_map. See [1]. > > Problem #2: Read-only fds can be "transmuted" into read-write fds via map in map > > 1. BPF_MAP_UPDATE_ELEM(map in map fd, read-only fd) > 2. BPF_MAP_LOOKUP_ELEM(map in map fd) = read-write fd > > This was pointed out by Stanislav Fomichev during the LPC session. I've not yet tried this myself. > > Problem #3: Read-only fds can be transmuted into read-write fds via object pinning > > 1. BPF_OBJ_PIN(read-only fd, /sys/fs/bpf/foo) > 2. BPF_OBJ_GET(/sys/fs/bpf/foo) = read-write fd > Just for completeness, this was also pointed out by Andy back in 2019: https://lore.kernel.org/bpf/CALCETrVtPs8gY-H4gmzSqPboid3CB++n50SvYd6RU9YVde_-Ow@xxxxxxxxxxxxxx > The problem is with BPF_OBJ_PIN semantics: regardless of fd->f_mode, pinning creates an inode that is owned by the current user, with mode o=rw. Even if we made the inode o=r, a user / attacker can still use chmod(2) to change it back to o=rw. > > On older kernels, this requires either unprivileged BPF or CAP_BPF, but recently BPF_OBJ_PIN has been made available without CAP_BPF. > > This problem also applies to other BPF objects: links, programs, maybe iterators? Since we don't have BPF_F_RDONLY semantics for those the issue is maybe less urgent, but see [2] for some more fun. > > A number of ideas were explored during the session: > > * In OBJ_PIN, create the inode owned by the user that executed MAP_CREATE, not the user that > invoked OBJ_PIN. This would allow unprivileged users to create files as another user, which > seems like a bad idea. > * In OBJ_GET, refuse a read-write fd if the fd passed to OBJ_PIN wasn't read-write. This is not > possible since we store struct bpf_map * in the inode, so we don't have access to fd->f_mode > anymore. > * In OBJ_PIN, adjust the mode of the created inode to match fd->f_mode, and later refuse attempts > to chmod(2). After a cursory glance at the source code it seems like there are no hooks for > filesystems to influence chmod. > > My gut feeling is that the root of the problem is that OBJ_PIN is too permissive. Once an inode exists that is owned by the current user the cat is out of the box. > > BPF_F_RDONLY and BPF_F_WRONLY were introduced in 4.15 [3]. If we want to fix this properly, aka relying on BPF_R_RDONLY doesn't introduce a gaping hole, we'll have to do quite a bit of backporting. > > I plan on submitting a sledgehammer approach fix for #1 and #2 as discussed with Daniel after my presentation. > > #3 is in sore need of further discussion and creativity. One avenue I want to explore is whether we can refuse OBJ_PIN if: > - the current user is not the map creator > - and the fd is not r/w > - and the current user has no CAP_DAC_OVERRIDE (or similar) > > Thanks > Lorenz > > 0: https://lpc.events/event/16/contributions/1372/attachments/977/2059/Plumbers%2022%20Closing%20the%20BPF%20map%20permission%20loophole.pdf > 1: https://elixir.bootlin.com/linux/v6.0-rc5/source/kernel/bpf/verifier.c#L12839 > 2: https://lore.kernel.org/bpf/20210618105526.265003-1-zenczykowski@xxxxxxxxx/ > 3: https://github.com/torvalds/linux/commit/6e71b04a8224