On Wed, Apr 5, 2023 at 10:44 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
>
> On Thu, Apr 6, 2023 at 12:24 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Wed, Apr 5, 2023 at 8:22 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > >
> > > On Thu, Apr 6, 2023 at 11:06 AM Alexei Starovoitov
> > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, Apr 5, 2023 at 7:55 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > > > >
> > > > > It seems that I didn't describe the issue clearly.
> > > > > The container doesn't have CAP_SYS_ADMIN, but the CAP_SYS_ADMIN is
> > > > > required to run bpftool, so the bpftool running in the container
> > > > > can't get the ID of bpf objects or convert IDs to FDs.
> > > > > Is there something that I missed ?
> > > >
> > > > Nothing. This is by design. bpftool needs sudo. That's all.
> > > >
> > >
> > > Hmm, what I'm trying to do is make bpftool run without sudo.
> >
> > This is not a task that is worth solving.
> >
>
> Then the container with CAP_BPF enabled can't even iterate its bpf progs ...

I'll leave the BPF namespace discussion aside (I agree that it needs
way more thought).

I am a bit surprised that we require CAP_SYS_ADMIN for GET_NEXT_ID
operations. GET_FD_BY_ID definitely needs CAP_SYS_ADMIN, as it allows
you to take over someone else's link and stuff like this. But just
iterating IDs seems like pretty innocent functionality, so maybe we
should remove CAP_SYS_ADMIN for GET_NEXT_ID?

By itself GET_NEXT_ID is relatively useless without capabilities, but
we've been floating the idea of providing GET_INFO_BY_ID (not by FD)
for a while now, and that seems useful in itself, as it would indeed
help tools like bpftool to get *some* information even without
privileges. Whether those GET_INFO_BY_ID operations should return the
same full bpf_{prog,map,link,btf}_info or some trimmed-down version of
them would be up for discussion, but I think getting some info without
creating an FD seems useful in itself.
Would it be worth discussing and solving this separately from
namespacing issues?

>
> > > > > Some questions,
> > > > > - What if the process exits after attaching the bpf prog and the prog
> > > > > is not auto-detachable?
> > > > > For example, the reuserport bpf prog is not auto-detachable. After
> > > > > pins the reuserport bpf prog, a task can attach it through the pinned
> > > > > bpf file, but if the task forgets to detach it and the pinned file is
> > > > > removed, then it seems there's no way to figure out which task or
> > > > > cgroup this prog belongs to...
> > > >
> > > > you're saying that there is a bpf prog in the kernel without
> > > > corresponding user space ?
> > >
> > > No, it is corresponding to user space. For example, it may be
> > > corresponding to a socket fd, or a cgroup fd.
> > >
> > > > Meaning no user space process has an FD
> > > > that points to this prog or FD to a map that this prog is using?
> > > > In such a case this is truly kernel bpf prog. It doesn't belong to cgroup.
> > > >
> > >
> > > Even if it is kernel bpf prog, it is created by a process. The user
> > > needs to know which one created it.
> >
> > In some situations it's certainly interesting to know which process
> > loaded a particular program.
> > In many other situations it's irrelevant.
> > For example, the process that loaded a prog could have been moved to a
> > different cgroup.
> > If you want to track the loading you need to install bpf_lsm
> > that monitors prog_load hook and collect that info.
> > It's not the job of the kernel to do it.
> >
>
> Agreed with you that we can add lots of hooks to track every detail of
> the operations.
> But it is not free. More hooks, more overhead.
> If we can change the kernel to make it lightweight, why not...
>
> > > > > - Could you pls. explain in detail how to get comm, pid, or cgroup
> > > > > from a pinned bpffs file?
> > > >
> > > > pinned bpf prog and no user space holds FD to it?
> > > > It's not part of any cgroup. Nothing to print.
> > >
> > > As I explained above, even if it holds nothing, the user needs to know
> > > the information from it. For example, if it is expected, which one
> > > created it?
> >
> > See the answer above. The kernel has enough hooks already to provide
> > this information to user space. No kernel changes necessary.
> >
>
> --
> Regards
> Yafang