On Fri, Mar 20, 2020 at 5:19 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Fri, Mar 20, 2020 at 4:46 PM Andrey Ignatov <rdna@xxxxxx> wrote: > > > > Andrii Nakryiko <andriin@xxxxxx> [Fri, 2020-03-20 13:37 -0700]: > > > Implement new sub-command to attach cgroup BPF programs and return FD-based > > > bpf_link back on success. bpf_link, once attached to cgroup, cannot be > > > replaced, except by owner having its FD. cgroup bpf_link has semantics of > > > BPF_F_ALLOW_MULTI BPF program attachments and can co-exist with > > > > Hi Andrii, > > > > Is there any reason to limit it to only BPF_F_ALLOW_MULTI? > > > > No technical reason, just felt like the good default behavior. It's > possible to support all the same flags as with legacy BPF program > attachment, but see below... So I went ahead and just added support for all the modes, it's very minor change in kernel itself, but needs a lot more selftesting logic, given all the new modes. Once I finish with that, I'll post v2 that also fixes build with !CONFIG_CGROUP_BPF. We can continue discussing orthogonal inheritance policies independently, if there is any intereset. > > > The thing is BPF_F_ALLOW_MULTI not only allows to attach multiple > > programs to specified cgroup but also controls what programs can later > > be attached to a sub-cgroup, and in BPF_F_ALLOW_MULTI case both > > sub-cgroup programs and specified cgroup programs will be executed (in > > this order). > > > > There many use-cases though when it's desired to either completely > > disallow attaching programs to a sub-cgroup or override parent cgroup's > > program behavior in sub-cgroup. If bpf_link covers only > > BPF_F_ALLOW_MULTI, those scenarios won't be able to leverage it. > > > > This double-purpose of BPF_F_ALLOW_MULTI is a pain ... For example if > > Yeah, exactly. I don't know historical reasons for why it was done the > way it was done (i.e., BPF_F_ALLOW_MULTI and BPF_F_ALLOW_OVERRIDE > flags), so maybe someone can give some insights there. But I wonder if > inheritance policy should be orthogonal to a single vs multiple > bpf_progs limit for a given cgroup. They could be specified on > per-cgroup level, not per-BPF program (and then making sure flags > match). That way it would be possible to express more nuanced > policies, like allowing multiple programs for a root cgroup, but > disallowing attach new BPF programs for any child cgroup, etc. > > Would be good to get some more perspectives on this... > > > one wants to attach multiple programs to a cgroup but disallow attaching > > programs to a sub-cgroup it's currently impossible (well, w/o additional > > cgroup level just for this). > > > > > non-bpf_link-based BPF cgroup attachments. > > > > > > To prevent bpf_cgroup_link from keeping cgroup alive past the point when no > > > BPF program can be executed, implement auto-detachment of link. When > > > cgroup_bpf_release() is called, all attached bpf_links are forced to release > > > cgroup refcounts, but they leave bpf_link otherwise active and allocated, as > > > well as still owning underlying bpf_prog. This is because user-space might > > > still have FDs open and active, so bpf_link as a user-referenced object can't > > > be freed yet. Once last active FD is closed, bpf_link will be freed and > > > underlying bpf_prog refcount will be dropped. But cgroup refcount won't be > > > touched, because cgroup is released already. > > > > > > The inherent race between bpf_cgroup_link release (from closing last FD) and > > > cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So > > > the only additional check required is when bpf_cgroup_link attempts to detach > > > itself from cgroup. At that time we need to check whether there is still > > > cgroup associated with that link. And if not, exit with success, because > > > bpf_cgroup_link was already successfully detached. > > > > > > Acked-by: Roman Gushchin <guro@xxxxxx> > > > Signed-off-by: Andrii Nakryiko <andriin@xxxxxx> > > > --- > > > include/linux/bpf-cgroup.h | 27 ++- > > > include/linux/bpf.h | 10 +- > > > include/uapi/linux/bpf.h | 9 +- > > > kernel/bpf/cgroup.c | 313 +++++++++++++++++++++++++-------- > > > kernel/bpf/syscall.c | 62 +++++-- > > > kernel/cgroup/cgroup.c | 14 +- > > > tools/include/uapi/linux/bpf.h | 9 +- > > > 7 files changed, 345 insertions(+), 99 deletions(-) > > > > > [...]