On Thu, 15 Sep 2022, at 11:30, Lorenz Bauer wrote: > Hi list, > > Here is a summary of the talk I gave at LPC '22 titled "Closing the BPF > map permission loophole", with slides at [0]. A timeline of the most important commits in this sorry affair. TL;DR: >= 4.15 is really broken. v4.4 commit b2197755b263 ("bpf: add support for persistent maps/progs") https://github.com/torvalds/linux/commit/b2197755b263 Introduces BPF_OBJ_PIN and OBJ_GET. Map fds are always read-write and all is fine. v4.12 commit 56f668dfe00d ("bpf: Add array of maps support") https://github.com/torvalds/linux/commit/56f668dfe00d Map fds are always read-write. Inner map map_flags are compared to the outer map via bpf_map_meta_equal, which prevents stripping BPF_F_RDONLY_PROG, etc. later on. v4.15 commit 6e71b04a8224 ("bpf: Add file mode configuration into bpf maps") https://github.com/torvalds/linux/commit/6e71b04a8224 Introduces BPF_F_RDONLY / WRONLY. Doesn't take into account that BPF programs can modify maps (unprivileged BPF is enabled since v4.3). Also doesn't take into account that BPF_OBJ_PIN can be used to turn r/o into r/w maps. This is the start of our problems, and IMO it's completely broken. v5.2 commit 591fe9888d78 ("bpf: add program side {rd, wr}only support for maps") https://github.com/torvalds/linux/commit/591fe9888d78 Introduces BPF_F_RDONLY_PROG, etc. Using sourcegraph.com I found a BPF ELF that creates a map with BPF_F_RDONLY | BPF_F_RDONLY_PROG flags [0]. This would be broken by refusing non-RW fds in the verifier, which is my preferred brute fix. Those flags came about because of a misunderstanding of what BPF_F_RDONLY does, which makes me worry might be other cases out there. Based on my research, we can do the following: > Problem #1: Read-only fds can be modified via a BPF program Up until 591fe9888d78 BPF_F_RDONLY_PROG we'd probably be OK with just refusing !rw fds. The only kernel.org version that this applies to is 4.19. 5.4, 5.10, 5.15 need the fix Roberto suggested. > Problem #2: Read-only fds can be "transmuted" into read-write fds via map in map This is not as big a problem as I initially thought. First, lookup in a nested map from userspace returns an ID, not a fd. Going from ID to fd requires CAP_SYS_ADMIN. On the program side, it's possible to retrieve the inner map and modify it. I think it's possible to fix this by refusing to insert !rw inner maps, by patching bpf_map_fd_get_ptr(). That function hasn't changed in a long time either. 4.19, 5.4, 5.10, 5.15. > Problem #3: Read-only fds can be transmuted into read-write fds via > object pinning > > 1. BPF_OBJ_PIN(read-only fd, /sys/fs/bpf/foo) > 2. BPF_OBJ_GET(/sys/fs/bpf/foo) = read-write fd bpf_map doesn't have a concept of an owning user, so the only solution I can come up with is to refuse OBJ_PIN if !rw and !capable(CAP_DAC_OVERRIDE). Replace CAP_DAC_OVERRIDE with another capability of your choice, the idea being that this allows programs that run as root (probably a lot?) to continue functioning. Again 4.19, 5.4, 5.10, 5.15. Daniel, Alexei, Andrii: any thoughts? This is a pretty deep rabbit hole, and I don't want to waste my time on the wrong approach. 0: https://github.com/willfindlay/bpfcontain-rs/blob/eb2cd826b609e165d63d784c0f562b7a278171d2/src/bpf/include/allocator.h#L16