On Thu, Jul 6, 2023 at 4:32 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> writes:
>
> > Having it as a separate single-purpose FS seems cleaner, because we
> > have use cases where we'd have one BPF FS instance created for a
> > container by our container manager, and then exposing a few separate
> > tokens with different sets of allowed functionality. E.g., one for
> > main intended workload, another for some BPF-based observability
> > tools, maybe yet another for more heavy-weight tools like bpftrace for
> > extra debugging. In the debugging case our container infrastructure
> > will be "evacuating" any other workloads on the same host to avoid
> > unnecessary consequences. The point is to not disturb
> > workload-under-human-debugging as much as possible, so we'd like to
> > keep userns intact, which is why mounting extra (more permissive) BPF
> > token inside already running containers is an important consideration.
>
> This example (as well as Yafang's in the sibling subthread) makes it
> even more apparent to me that it would be better with a model where the
> userspace policy daemon can just make decisions on each call directly,
> instead of mucking about with different tokens with different embedded
> permissions. Why not go that route (see my other reply for details on
> what I mean)?

I don't know how you arrived at this conclusion, but we've debated BPF
proxying and separate service at length, there is no point in going on
another round here. Per-call decisions can be achieved nicely by
employing BPF LSM in a restrictive manner on top of BPF token (or no
token, if you are ok without user namespaces).

>
> -Toke
>