On Mon, Mar 30, 2020 at 7:49 AM David Ahern <dsahern@xxxxxxxxx> wrote:
>
> On 3/29/20 8:59 PM, Andrii Nakryiko wrote:
> > bpf_link abstraction itself was formalized in [0] with justifications for why
> > its semantics is a good fit for attaching BPF programs of various types. This
> > patch set adds bpf_link-based BPF program attachment mechanism for cgroup BPF
> > programs.
> >
> > Cgroup BPF link is semantically compatible with current BPF_F_ALLOW_MULTI
> > semantics of attaching cgroup BPF programs directly. Thus cgroup bpf_link can
> > co-exist with legacy BPF program multi-attachment.
> >
> > bpf_link is destroyed and automatically detached when the last open FD holding
> > the reference to bpf_link is closed. This means that by default, when the
> > process that created bpf_link exits, attached BPF program will be
> > automatically detached due to bpf_link's clean up code. Cgroup bpf_link, like
> > any other bpf_link, can be pinned in BPF FS and by those means survive the
> > exit of process that created the link. This is useful in many scenarios to
> > provide long-living BPF program attachments. Pinning also means that there
> > could be many owners of bpf_link through independent FDs.
> >
> > Additionally, auto-detachment of cgroup bpf_link is implemented. When cgroup is
> > dying it will automatically detach all active bpf_links. This ensures that
> > cgroup clean up is not delayed due to active bpf_link even despite no chance
> > for any BPF program to be run for a given cgroup. In that sense it's similar
> > to existing behavior of dropping refcnt of attached bpf_prog. But in the case
> > of bpf_link, bpf_link is not destroyed and is still available to user as long
> > as at least one active FD is still open (or if it's pinned in BPF FS).
> >
> > There are two main cgroup-specific differences between bpf_link-based and
> > direct bpf_prog-based attachment.
> >
> > First, as opposed to direct bpf_prog attachment, cgroup itself doesn't "own"
> > bpf_link, which makes it possible to auto-clean up attached bpf_link when user
> > process abruptly exits without explicitly detaching BPF program. This makes
> > for a safe default behavior proven in BPF tracing program types. But bpf_link
> > doesn't bump cgroup->bpf.refcnt as well and because of that doesn't prevent
> > cgroup from cleaning up its BPF state.
> >
> > Second, only owners of bpf_link (those who created bpf_link in the first place
> > or obtained a new FD by opening bpf_link from BPF FS) can detach and/or update
> > it. This makes sure that no other process can accidentally remove/replace BPF
> > program.
> >
> > This patch set also implements LINK_UPDATE sub-command, which allows to
> > replace bpf_link's underlying bpf_prog, similarly to BPF_F_REPLACE flag
> > behavior for direct bpf_prog cgroup attachment. Similarly to LINK_CREATE, it
> > is supposed to be generic command for different types of bpf_links.
> >
>
> The observability piece should go in the same release as the feature.

You mean LINK_QUERY command I mentioned before? Yes, I'm working on
adding it next, regardless if this patch set goes in right now or
later.
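
For anyone following the thread, the lifetime rules from the quoted cover
letter can be sketched as a small user-space model. This is an illustration
only, not code from the patch set: struct model_link and every model_*
function below are hypothetical names that exist neither in the kernel nor
in libbpf, and the model tracks just three things the cover letter
describes (open FDs, pinning in BPF FS, and cgroup death).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the semantics quoted above.
 * None of these names exist in the kernel or in libbpf. */
struct model_link {
	int  fd_refs;   /* open FDs referring to the link */
	bool pinned;    /* true while pinned in BPF FS */
	bool attached;  /* program still attached to the cgroup */
	bool destroyed; /* link object is gone */
};

/* Destroy (and detach) once neither an FD nor a pin keeps the link alive. */
static void model_link_maybe_destroy(struct model_link *l)
{
	if (l->fd_refs == 0 && !l->pinned) {
		l->attached = false;
		l->destroyed = true;
	}
}

/* Creating a link attaches the program and hands back one FD. */
static struct model_link model_link_create(void)
{
	struct model_link l = {
		.fd_refs = 1, .pinned = false,
		.attached = true, .destroyed = false,
	};
	return l;
}

static void model_link_pin(struct model_link *l)
{
	assert(!l->destroyed);
	l->pinned = true;
}

static void model_link_unpin(struct model_link *l)
{
	l->pinned = false;
	model_link_maybe_destroy(l);
}

/* Opening the pinned path gives another owner an independent FD. */
static void model_link_open_pinned(struct model_link *l)
{
	assert(l->pinned && !l->destroyed);
	l->fd_refs++;
}

static void model_link_close_fd(struct model_link *l)
{
	assert(l->fd_refs > 0);
	l->fd_refs--;
	model_link_maybe_destroy(l);
}

/* A dying cgroup detaches the program but leaves the link object alive. */
static void model_cgroup_dying(struct model_link *l)
{
	l->attached = false;
}
```

In this model, closing the only FD of an unpinned link leaves it destroyed
and detached, while a pinned link stays attached after its creator's FD is
closed, and a dying cgroup detaches the program without destroying the link.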