On Thu, Aug 15, 2019 at 10:29 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Aug 15, 2019 at 11:24:54AM +0000, Jordan Glover wrote:
> > systemd --user processes aren't "less privileged". The are COMPLETELY unprivileged.
> > Granting them cap_bpf is the same as granting it to every other unprivileged user
> > process. Also unprivileged user process can start systemd --user process with any
> > command they like.
>
> systemd itself is trusted. It's the same binary whether it runs as pid=1
> or as pid=123. One of the use cases is to make IPAddressDeny= work with --user.
> Subset of that feature already works with AmbientCapabilities=CAP_NET_ADMIN.
> CAP_BPF is a natural step in the same direction.
>

I have the feeling that we're somehow speaking different languages. What, precisely, do you mean when you say "systemd itself is trusted"? Do you mean "the administrator trusts that the /lib/systemd/systemd binary is not malicious"? Do you mean "the administrator trusts that the running systemd process is not malicious"?

On a regular Linux desktop or server box, passing CAP_NET_ADMIN, your envisioned CAP_BPF, or /dev/bpf as in this patchset through to a systemd --user binary would be a gaping security hole. You are welcome to do it on your own systemd, but if a distro did it, it would be a major error.

If you want IPAddressDeny= to work in a user systemd unit (i.e. /etc/systemd/user/*), then I think you have two choices. You could have an API by which systemd --user can ask a privileged helper to assist (which has all the challenges you mentioned but is definitely *possible*) or the kernel bpf() interfaces need to be designed so that, in the absence of kernel bugs, they are safe to use from an unprivileged process. By "safe", I mean "would not expose the system to attack if the kernel's implementation of the bpf() ABI were perfect". My suggestions upthread for incrementally making bpf() depend less on privilege would accomplish this goal.

It would be entirely reasonable to say that, even with those changes, bpf() is still a large attack surface and access to it should be restricted, and having a capability or other mechanism to explicitly grant access to the hopefully-secure-but-plausibly-buggy parts of bpf() would make sense. But you rejected that idea and said you "realized that [changing all the capable() checks is] perfect as-is" without much explanation, which makes it hard to understand where you're coming from.