On Wed, May 27, 2020 at 07:08 PM CEST, Jakub Sitnicki wrote:
> Add support for bpf() syscall subcommands that operate on
> bpf_link (LINK_CREATE, LINK_UPDATE, OBJ_GET_INFO) for attach points tied to
> network namespaces (that is flow dissector at the moment).
>
> Link-based and prog-based attachment can be used interchangeably, but only
> one can be in use at a time. Attempts to attach a link when a prog is
> already attached directly, and the other way around, will be met with
> -EBUSY.
>
> Attachment of multiple links of the same attach type to one netns is not
> supported, with the intention to lift it when a use-case presents
> itself. Because of that, attempts to create a netns link when one already
> exists result in an -E2BIG error, signifying that there is no space left
> for another attachment.
>
> Link-based attachments to netns don't keep a netns alive by holding a ref
> to it. Instead links get auto-detached from netns when the latter is being
> destroyed by a pernet pre_exit callback.
>
> When auto-detached, link lives in defunct state as long as there are open
> FDs for it. -ENOLINK is returned if a user tries to update a defunct link.
>
> Because bpf_link to netns doesn't hold a ref to struct net, special care is
> taken when releasing the link. The netns might be getting torn down when
> the release function tries to access it to detach the link.
>
> To ensure the struct net object is alive when release function accesses it
> we rely on the fact that cleanup_net(), struct net destructor, calls
> synchronize_rcu() after invoking pre_exit callbacks. If auto-detach from
> pre_exit happens first, link release will not attempt to access struct net.
>
> Same applies the other way around, network namespace doesn't keep an
> attached link alive by not holding a ref to it. Instead bpf_links
> to netns are RCU-freed, so that pernet pre_exit callback can safely access
> and auto-detach the link when racing with link release/free.
>
> Signed-off-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
> ---

[...]

> +static int bpf_netns_link_update_prog(struct bpf_link *link,
> +				      struct bpf_prog *new_prog,
> +				      struct bpf_prog *old_prog)
> +{
> +	struct bpf_netns_link *net_link = to_bpf_netns_link(link);
> +	struct net *net;
> +	int ret = 0;
> +
> +	if (old_prog && old_prog != link->prog)
> +		return -EPERM;
> +	if (new_prog->type != link->prog->type)
> +		return -EINVAL;
> +
> +	mutex_lock(&netns_bpf_mutex);
> +	rcu_read_lock();
> +
> +	net = rcu_dereference(net_link->net);
> +	if (!net || !check_net(net)) {
> +		/* Link auto-detached or netns dying */
> +		ret = -ENOLINK;
> +		goto out_unlock;
> +	}
> +
> +	old_prog = xchg(&link->prog, new_prog);
> +	bpf_prog_put(old_prog);

I've noticed a bug here. I should be updating net->bpf.progs[type] here
as well. Will fix in v2.

> +
> +out_unlock:
> +	rcu_read_unlock();
> +	mutex_unlock(&netns_bpf_mutex);
> +
> +	return ret;
> +}

[...]
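
To illustrate the problem, here is a rough userspace model of the update
path, not the actual v2 code. All names in it (struct prog, struct net,
struct netns_link, prog_put(), netns_link_update_prog()) are hypothetical
stand-ins for the kernel objects, and the netns_bpf_mutex/RCU locking is
left out on purpose. The point it shows is that a successful update has to
swap both the link's prog pointer and the per-netns slot that the model
calls progs[type]; otherwise the netns keeps pointing at the old program.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel objects; every name is hypothetical. */
enum attach_type { ATTACH_FLOW_DISSECTOR, MAX_ATTACH_TYPE };

struct prog {
	int type;	/* models bpf_prog->type */
	int refcnt;	/* models the program reference count */
};

struct net {
	int alive;				/* models check_net() */
	struct prog *progs[MAX_ATTACH_TYPE];	/* models net->bpf.progs[] */
};

struct netns_link {
	struct prog *prog;	/* models link->prog */
	struct net *net;	/* NULL once the link is auto-detached */
	enum attach_type type;
};

/* Drop one reference; models bpf_prog_put() without freeing anything. */
static void prog_put(struct prog *p)
{
	p->refcnt--;
}

/*
 * Models the update path with the missing step added. Locking and RCU are
 * omitted; the caller's reference on new_prog is handed over to the link.
 */
static int netns_link_update_prog(struct netns_link *link,
				  struct prog *new_prog,
				  struct prog *old_prog)
{
	struct net *net = link->net;

	assert(new_prog != NULL);

	if (old_prog && old_prog != link->prog)
		return -EPERM;
	if (new_prog->type != link->prog->type)
		return -EINVAL;
	if (!net || !net->alive)
		return -ENOLINK;	/* link auto-detached or netns dying */

	old_prog = link->prog;
	link->prog = new_prog;
	net->progs[link->type] = new_prog;	/* the step missing from v1 */
	prog_put(old_prog);

	return 0;
}
```

Without the net->progs[] assignment the model would still return 0, but
the netns slot would keep referencing a program whose reference was just
dropped.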