On Mon, May 11, 2020 at 05:12:10PM -0700, sdf@xxxxxxxxxx wrote:
> On 05/08, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> [..]
> > @@ -3932,7 +3977,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr
> > __user *, uattr, unsigned int, siz
> >  	union bpf_attr attr;
> >  	int err;
> >
> > -	if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
> > +	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
> >  		return -EPERM;
> This is awesome, thanks for reviving the effort!
>
> One question I have about this particular snippet:
> Does it make sense to drop bpf_capable checks for the operations
> that work on a provided fd?

Above snippet is for the case when sysctl switches unpriv off.
It was a big hammer and stays big hammer.
I certainly would like to improve the situation, but I suspect
the folks who turn that sysctl knob on are simply paranoid about bpf
and no amount of reasoning would turn them around.

> The use-case I have in mind is as follows:
> * privileged (CAP_BPF) process loads the programs/maps and pins
>   them at some known location
> * unprivileged process opens up those pins and does the following:
>   * prepares the maps (and will later on read them)
>   * does SO_ATTACH_BPF/SO_ATTACH_REUSEPORT_EBPF which afaik don't
>     require any capabilities
>
> This essentially pushes some of the permission checks into a fs layer. So
> whoever has a file descriptor (via unix sock or open) can do BPF operations
> on the object that represents it.

cap_bpf doesn't change things in that regard.
Two cases here:
sysctl_unprivileged_bpf_disabled==0:
  Unpriv can load socket_filter prog type and unpriv can attach it
  via SO_ATTACH_BPF/SO_ATTACH_REUSEPORT_EBPF.
sysctl_unprivileged_bpf_disabled==1:
  cap_sys_admin can load socket_filter and unpriv can attach it.

With addition of cap_bpf in the second case cap_bpf process can
load socket_filter too.
It doesn't mean that permissions are pushed into fs layer.
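To make the two cases concrete, here is a rough user-space model of the gate, not kernel code. The struct and every model_* name are made up for illustration, and it assumes bpf_capable() is satisfied by either CAP_BPF or CAP_SYS_ADMIN, which is what the second case above implies.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the capabilities a task holds. */
struct model_task_caps {
	bool cap_sys_admin;
	bool cap_bpf;
};

/* Assumption: bpf_capable() accepts CAP_BPF or CAP_SYS_ADMIN. */
static bool model_bpf_capable(const struct model_task_caps *t)
{
	return t->cap_bpf || t->cap_sys_admin;
}

/*
 * Models the check at the top of the bpf() syscall from the quoted hunk.
 * Returns true when the call is NOT rejected with -EPERM up front.
 * socket_filter is the prog type unpriv may load, so for that type this
 * gate is the only thing modelled here.
 */
static bool model_can_load_socket_filter(bool unpriv_disabled,
					 const struct model_task_caps *t)
{
	if (unpriv_disabled && !model_bpf_capable(t))
		return false;
	return true;
}

/*
 * SO_ATTACH_BPF / SO_ATTACH_REUSEPORT_EBPF go through setsockopt(),
 * not bpf(), so the sysctl gate above is never consulted.
 */
static bool model_can_attach_socket_filter(bool unpriv_disabled,
					   const struct model_task_caps *t)
{
	(void)unpriv_disabled;
	(void)t;
	return true;
}
```

In this model the only thing cap_bpf changes is who passes the load check when the sysctl is 1. The attach path is the same in both cases.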
I'm not sure that relaxing sysctl_unprivileged_bpf_disabled will be
well received.
Are you proposing to selectively allow certain bpf syscall commands
even when sysctl_unprivileged_bpf_disabled==1?
Like allow unpriv to do BPF_OBJ_GET to get an fd from bpffs?
And allow unpriv to do map_update?
It makes complete sense to me, but I'd like to argue about that
independently from this cap_bpf set.
We can relax that sysctl later.
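If we did go that way later, the relaxed gate might look roughly like the sketch below. This is purely illustrative and not part of this set: the enum, the per-command allowlist, and all model_* names are hypothetical, and which commands would actually be safe to allow is exactly the part that needs the separate discussion.

```c
#include <stdbool.h>

/* Hypothetical subset of bpf() commands, named after the real ones. */
enum model_bpf_cmd {
	MODEL_BPF_MAP_CREATE,
	MODEL_BPF_PROG_LOAD,
	MODEL_BPF_OBJ_GET,
	MODEL_BPF_MAP_UPDATE_ELEM,
};

/*
 * Assumption: commands that only operate on an already existing
 * object (a pinned path or a provided fd) are the candidates for
 * relaxation. Creating or loading new objects is not.
 */
static bool model_cmd_uses_existing_object(enum model_bpf_cmd cmd)
{
	switch (cmd) {
	case MODEL_BPF_OBJ_GET:
	case MODEL_BPF_MAP_UPDATE_ELEM:
		return true;
	default:
		return false;
	}
}

/* Returns true when the call would not be rejected with -EPERM. */
static bool model_relaxed_gate(bool unpriv_disabled, bool bpf_capable,
			       enum model_bpf_cmd cmd)
{
	/* Same as today: sysctl off, or caller is capable. */
	if (!unpriv_disabled || bpf_capable)
		return true;
	/* Hypothetical relaxation: per-command exception for unpriv. */
	return model_cmd_uses_existing_object(cmd);
}
```

With sysctl_unprivileged_bpf_disabled==1 this would still keep unpriv from creating maps or loading programs, while letting it open a pin and update a map it was handed.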