Yafang Shao <laoar.shao@xxxxxxxxx> writes:

> Currently only CAP_SYS_ADMIN can iterate BPF object IDs and convert IDs
> to FDs, which is intended by BPF's security model[1]. Not only does it
> prevent non-privileged users from getting other users' bpf programs, but
> it also prevents a user from iterating his own bpf objects.
>
> In a container environment, some users want to run bpf programs in their
> containers. These users can run their bpf programs under CAP_BPF and
> some other specific CAPs, but they can't inspect their bpf programs in a
> generic way. For example, bpftool can't be used as it requires
> CAP_SYS_ADMIN. That is very inconvenient.
>
> Without CAP_SYS_ADMIN, the only way to get the information of a bpf object
> which is not created by the process itself is with SCM_RIGHTS, which
> requires each process that creates a bpf object to implement a unix
> domain socket to share the fd of the bpf object between different
> processes. That is really tedious and troublesome.
>
> Hence we need a better mechanism to get bpf object info without
> CAP_SYS_ADMIN.
>
> BPF namespace is introduced in this patchset in an attempt to remove
> the CAP_SYS_ADMIN requirement. The user can create bpf maps, progs and
> links in a specific bpf namespace, and these bpf objects will then not be
> visible to users in a different bpf namespace. But these bpf
> objects are visible to the parent bpf namespace, so the sys admin can
> still iterate and inspect them.
>
> BPF namespace is similar to PID namespace, and the bpf objects are
> similar to tasks, so BPF namespace is very easy to understand. This
> patchset only implements BPF namespace for bpf map, prog and link. In the
> future we may extend it to other bpf objects like btf, bpffs, etc.

May? I think we should cover all of the existing BPF objects from the
beginning here, or we may miss important interactions that will
invalidate the whole idea.

In particular, I'm a little worried about the interaction between
namespaces and bpffs; what happens if you're in a bpf namespace and you
try to read a BPF object from a bpffs that belongs to a different
namespace? Does the operation fail? Is the object hidden entirely?
Something else?

-Toke
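
For context on the SCM_RIGHTS workaround mentioned in the quoted cover
letter, here is a minimal sketch of passing a file descriptor over a unix
domain socket. It is an illustration under stated assumptions, not code
from the patchset: send_fd(), recv_fd() and demo_roundtrip() are
hypothetical helper names, and an ordinary pipe fd stands in for a bpf
map/prog/link fd so that the sketch runs without CAP_BPF or libbpf. A
socketpair() in one process stands in for the two cooperating processes.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one fd over a connected AF_UNIX socket. Returns 0 on success. */
int send_fd(int sock, int fd)
{
    char byte = 'F'; /* one byte of ordinary data carries the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { /* union guarantees cmsghdr alignment for the buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS; /* ancillary payload is an array of fds */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from a connected AF_UNIX socket. Returns the fd or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd; /* a new fd referring to the same open file description */
}

/* Pass a pipe's read end across a socketpair and read through the copy. */
int demo_roundtrip(void)
{
    int sv[2], pfd[2], got, ok;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(pfd) < 0)
        return -1;
    if (send_fd(sv[0], pfd[0]) < 0)
        return -1;
    got = recv_fd(sv[1]);
    if (got < 0)
        return -1;

    /* Data written to the pipe must be readable through the received fd. */
    ok = write(pfd[1], "x", 1) == 1 && read(got, &c, 1) == 1 && c == 'x';

    close(got);
    close(pfd[0]);
    close(pfd[1]);
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Every program that creates a bpf object would need to carry a listener
along these lines for another process to obtain the fd, which is the
per-process burden the cover letter is describing.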