On 5/8/2020 2:53 PM, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@xxxxxxxxxx>
>
> v4->v5:
>
> Split BPF operations that are allowed under CAP_SYS_ADMIN into combination of
> CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN and keep some of them under CAP_SYS_ADMIN.
>
> The user process has to have
> - CAP_BPF and CAP_PERFMON to load tracing programs.
> - CAP_BPF and CAP_NET_ADMIN to load networking programs.
> (or CAP_SYS_ADMIN for backward compatibility).

Is there a case where CAP_BPF is useful in the absence of other capabilities?
I generally object to new capabilities in cases where existing capabilities
are already required.

>
> CAP_BPF solves three main goals:
> 1. provides isolation to user space processes that drop CAP_SYS_ADMIN and switch to CAP_BPF.
>    More on this below. This is the major difference vs v4 set back from Sep 2019.
> 2. makes networking BPF progs more secure, since CAP_BPF + CAP_NET_ADMIN
>    prevents pointer leaks and arbitrary kernel memory access.
> 3. enables fuzzers to exercise all of the verifier logic. Eventually finding bugs
>    and making BPF infra more secure. Currently fuzzers run in unpriv.
>    They will be able to run with CAP_BPF.
>
> The patchset is long overdue follow-up from the last plumbers conference.
> Comparing to what was discussed at LPC the CAP* checks at attach time are gone.
> For tracing progs the CAP_SYS_ADMIN check was done at load time only. There was
> no check at attach time. For networking and cgroup progs CAP_SYS_ADMIN was
> required at load time and CAP_NET_ADMIN at attach time, but there are several
> ways to bypass CAP_NET_ADMIN:
> - if networking prog is using tail_call writing FD into prog_array will
>   effectively attach it, but bpf_map_update_elem is an unprivileged operation.
> - freplace prog with CAP_SYS_ADMIN can replace networking prog
>
> Consolidating all CAP checks at load time makes security model similar to
> open() syscall. Once the user got an FD it can do everything with it.
> read/write/poll don't check permissions. The same way when bpf_prog_load
> command returns an FD the user can do everything (including attaching,
> detaching, and bpf_test_run).
>
> The important design decision is to allow ID->FD transition for
> CAP_SYS_ADMIN only. What it means that user processes can run
> with CAP_BPF and CAP_NET_ADMIN and they will not be able to affect each
> other unless they pass FDs via scm_rights or via pinning in bpffs.
> ID->FD is a mechanism for human override and introspection.
> An admin can do 'sudo bpftool prog ...'. It's possible to enforce via LSM that
> only bpftool binary does bpf syscall with CAP_SYS_ADMIN and the rest of user
> space processes do bpf syscall with CAP_BPF isolating bpf objects (progs, maps,
> links) that are owned by such processes from each other.
>
> Another significant change from LPC is that the verifier checks are split into
> allow_ptr_leaks and bpf_capable flags. The allow_ptr_leaks disables spectre
> defense and allows pointer manipulations while bpf_capable enables all modern
> verifier features like bpf-to-bpf calls, BTF, bounded loops, indirect stack
> access, dead code elimination, etc. All the goodness.
> These flags are initialized as:
>   env->allow_ptr_leaks = perfmon_capable();
>   env->bpf_capable = bpf_capable();
> That allows networking progs with CAP_BPF + CAP_NET_ADMIN enjoy modern
> verifier features while being more secure.
>
> Some networking progs may need CAP_BPF + CAP_NET_ADMIN + CAP_PERFMON,
> since subtracting pointers (like skb->data_end - skb->data) is a pointer leak,
> but the verifier may get smarter in the future.
>
> Please see patches for more details.
>
> Alexei Starovoitov (3):
>   bpf, capability: Introduce CAP_BPF
>   bpf: implement CAP_BPF
>   selftests/bpf: use CAP_BPF and CAP_PERFMON in tests
>
>  drivers/media/rc/bpf-lirc.c                  |  2 +-
>  include/linux/bpf_verifier.h                 |  1 +
>  include/linux/capability.h                   |  5 ++
>  include/uapi/linux/capability.h              | 34 +++++++-
>  kernel/bpf/arraymap.c                        |  2 +-
>  kernel/bpf/bpf_struct_ops.c                  |  2 +-
>  kernel/bpf/core.c                            |  4 +-
>  kernel/bpf/cpumap.c                          |  2 +-
>  kernel/bpf/hashtab.c                         |  4 +-
>  kernel/bpf/helpers.c                         |  4 +-
>  kernel/bpf/lpm_trie.c                        |  2 +-
>  kernel/bpf/queue_stack_maps.c                |  2 +-
>  kernel/bpf/reuseport_array.c                 |  2 +-
>  kernel/bpf/stackmap.c                        |  2 +-
>  kernel/bpf/syscall.c                         | 87 ++++++++++++++-----
>  kernel/bpf/verifier.c                        | 24 ++---
>  kernel/trace/bpf_trace.c                     |  3 +
>  net/core/bpf_sk_storage.c                    |  4 +-
>  net/core/filter.c                            |  4 +-
>  security/selinux/include/classmap.h          |  4 +-
>  tools/testing/selftests/bpf/test_verifier.c  | 44 ++++++++--
>  tools/testing/selftests/bpf/verifier/calls.c | 16 ++--
>  .../selftests/bpf/verifier/dead_code.c       | 10 +--
>  23 files changed, 191 insertions(+), 73 deletions(-)
>
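
For reference, the load-time rules from the cover letter can be written out as a small user-space model. This is only a sketch for discussion, not code from the patches: the CAP_BIT_* flags, the model_* and can_* helpers, and struct verifier_flags are hypothetical names, and it assumes that CAP_SYS_ADMIN satisfies every check ("for backward compatibility"), including the two verifier flags.

```c
#include <stdbool.h>

/* Hypothetical bit flags; these are NOT the kernel's capability numbers. */
enum cap_bit {
	CAP_BIT_SYS_ADMIN = 1 << 0,
	CAP_BIT_NET_ADMIN = 1 << 1,
	CAP_BIT_PERFMON   = 1 << 2,
	CAP_BIT_BPF       = 1 << 3,
};

static bool has_cap(unsigned int caps, unsigned int bit)
{
	return (caps & bit) != 0;
}

/* Assumption: CAP_SYS_ADMIN is accepted wherever CAP_BPF is. */
static bool model_bpf_capable(unsigned int caps)
{
	return has_cap(caps, CAP_BIT_BPF) || has_cap(caps, CAP_BIT_SYS_ADMIN);
}

/* Assumption: CAP_SYS_ADMIN is accepted wherever CAP_PERFMON is. */
static bool model_perfmon_capable(unsigned int caps)
{
	return has_cap(caps, CAP_BIT_PERFMON) || has_cap(caps, CAP_BIT_SYS_ADMIN);
}

/* Tracing programs: CAP_BPF and CAP_PERFMON, or CAP_SYS_ADMIN. */
static bool can_load_tracing(unsigned int caps)
{
	return model_bpf_capable(caps) && model_perfmon_capable(caps);
}

/* Networking programs: CAP_BPF and CAP_NET_ADMIN, or CAP_SYS_ADMIN. */
static bool can_load_networking(unsigned int caps)
{
	if (has_cap(caps, CAP_BIT_SYS_ADMIN))
		return true;
	return has_cap(caps, CAP_BIT_BPF) && has_cap(caps, CAP_BIT_NET_ADMIN);
}

/* ID->FD transition is reserved for CAP_SYS_ADMIN. */
static bool can_id_to_fd(unsigned int caps)
{
	return has_cap(caps, CAP_BIT_SYS_ADMIN);
}

/* Mirrors the two verifier flags named in the cover letter. */
struct verifier_flags {
	bool allow_ptr_leaks;
	bool bpf_capable;
};

static struct verifier_flags verifier_flags_for(unsigned int caps)
{
	struct verifier_flags f = {
		.allow_ptr_leaks = model_perfmon_capable(caps),
		.bpf_capable = model_bpf_capable(caps),
	};
	return f;
}
```

In this model CAP_BPF by itself passes neither can_load_tracing() nor can_load_networking(); it only sets bpf_capable in the verifier flags, which is the case the question above is asking about.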