On 3/29/20 8:59 PM, Andrii Nakryiko wrote:
> bpf_link abstraction itself was formalized in [0] with justifications for why
> its semantics is a good fit for attaching BPF programs of various types. This
> patch set adds bpf_link-based BPF program attachment mechanism for cgroup BPF
> programs.
>
> Cgroup BPF link is semantically compatible with current BPF_F_ALLOW_MULTI
> semantics of attaching cgroup BPF programs directly. Thus cgroup bpf_link can
> co-exist with legacy BPF program multi-attachment.
>
> bpf_link is destroyed and automatically detached when the last open FD holding
> the reference to bpf_link is closed. This means that by default, when the
> process that created bpf_link exits, attached BPF program will be
> automatically detached due to bpf_link's clean up code. Cgroup bpf_link, like
> any other bpf_link, can be pinned in BPF FS and by those means survive the
> exit of process that created the link. This is useful in many scenarios to
> provide long-living BPF program attachments. Pinning also means that there
> could be many owners of bpf_link through independent FDs.
>
> Additionally, auto-detachment of cgroup bpf_link is implemented. When cgroup is
> dying it will automatically detach all active bpf_links. This ensures that
> cgroup clean up is not delayed due to active bpf_link even despite no chance
> for any BPF program to be run for a given cgroup. In that sense it's similar
> to existing behavior of dropping refcnt of attached bpf_prog. But in the case
> of bpf_link, bpf_link is not destroyed and is still available to user as long
> as at least one active FD is still open (or if it's pinned in BPF FS).
>
> There are two main cgroup-specific differences between bpf_link-based and
> direct bpf_prog-based attachment.
>
> First, as opposed to direct bpf_prog attachment, cgroup itself doesn't "own"
> bpf_link, which makes it possible to auto-clean up attached bpf_link when user
> process abruptly exits without explicitly detaching BPF program. This makes
> for a safe default behavior proven in BPF tracing program types. But bpf_link
> doesn't bump cgroup->bpf.refcnt as well and because of that doesn't prevent
> cgroup from cleaning up its BPF state.
>
> Second, only owners of bpf_link (those who created bpf_link in the first place
> or obtained a new FD by opening bpf_link from BPF FS) can detach and/or update
> it. This makes sure that no other process can accidentally remove/replace BPF
> program.
>
> This patch set also implements LINK_UPDATE sub-command, which allows to
> replace bpf_link's underlying bpf_prog, similarly to BPF_F_REPLACE flag
> behavior for direct bpf_prog cgroup attachment. Similarly to LINK_CREATE, it
> is supposed to be generic command for different types of bpf_links.
>

The observability piece should go in the same release as the feature.
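
For anyone following the lifetime rules in the quoted cover letter, here is a rough user-space sketch of how I read them. It is a toy model only: every `toy_*` name is hypothetical, nothing here is kernel or libbpf code, and the choice to make an update fail on an already-detached link is an assumption of the model rather than something the cover letter states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; every name here is hypothetical, not kernel code. */

struct toy_cgroup {
    int attached;           /* links currently attached to this cgroup */
};

struct toy_link {
    int refs;               /* open FDs plus BPF FS pins */
    int prog_id;            /* stands in for the underlying bpf_prog */
    struct toy_cgroup *cg;  /* NULL once the link is detached */
    bool destroyed;
};

/* Detach the link from its cgroup, if it is still attached. */
static void toy_link_detach(struct toy_link *l)
{
    if (l->cg) {
        l->cg->attached--;
        l->cg = NULL;
    }
}

/* Models LINK_CREATE: the creator holds the first reference (its FD). */
static void toy_link_create(struct toy_link *l, struct toy_cgroup *cg, int prog_id)
{
    l->refs = 1;
    l->prog_id = prog_id;
    l->cg = cg;
    l->destroyed = false;
    cg->attached++;
}

/* Models pinning in BPF FS or opening another FD: one more owner. */
static void toy_link_get(struct toy_link *l)
{
    l->refs++;
}

/* Models closing an FD or unpinning; the last put detaches and destroys. */
static void toy_link_put(struct toy_link *l)
{
    assert(l->refs > 0);
    if (--l->refs == 0) {
        toy_link_detach(l);
        l->destroyed = true;
    }
}

/* Models LINK_UPDATE: swap the program in place.
 * Failing on a detached link is an assumption of this model. */
static int toy_link_update(struct toy_link *l, int new_prog_id)
{
    if (!l->cg)
        return -1;
    l->prog_id = new_prog_id;
    return 0;
}

/* Models a dying cgroup: detach its links but leave them alive. */
static void toy_cgroup_die(struct toy_cgroup *cg, struct toy_link **links, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (links[i]->cg == cg)
            toy_link_detach(links[i]);
}

/* Walks the scenarios from the cover letter; returns 0 if every check holds. */
int toy_demo(void)
{
    struct toy_cgroup cg = { 0 };
    struct toy_link a, b;
    struct toy_link *all[] = { &a, &b };

    /* Unpinned link: closing the only FD detaches and destroys it. */
    toy_link_create(&a, &cg, 1);
    toy_link_put(&a);
    if (!a.destroyed || cg.attached != 0)
        return 1;

    /* Pinned link: survives the creator closing its FD. */
    toy_link_create(&b, &cg, 2);
    toy_link_get(&b);   /* pin in BPF FS */
    toy_link_put(&b);   /* creating process exits */
    if (b.destroyed || cg.attached != 1)
        return 2;

    /* An owner replaces the underlying program. */
    if (toy_link_update(&b, 3) != 0 || b.prog_id != 3)
        return 3;

    /* Cgroup dies: the link is detached but not destroyed. */
    toy_cgroup_die(&cg, all, 2);
    if (b.cg != NULL || b.destroyed || cg.attached != 0)
        return 4;

    /* Updating the detached link fails; dropping the pin destroys it. */
    if (toy_link_update(&b, 4) != -1)
        return 5;
    toy_link_put(&b);
    return b.destroyed ? 0 : 6;
}
```

Calling `toy_demo()` from a `main` exercises each case in order. The point of the model is that the cgroup never holds a reference on the link: only FDs and pins do, so the cgroup can go away first and the link simply ends up detached.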