On Tue, Mar 7, 2023 at 3:33 PM Kui-Feng Lee <kuifeng@xxxxxxxx> wrote:
>
> BPF struct_ops maps are employed directly to register TCP Congestion
> Control algorithms. Unlike other BPF programs that terminate when
> their links gone. The link of a BPF struct_ops map provides a uniform
> experience akin to other types of BPF programs.
>
> bpf_links are responsible for registering their associated
> struct_ops. You can only use a struct_ops that has the BPF_F_LINK flag
> set to create a bpf_link, while a structs without this flag behaves in
> the same manner as before and is registered upon updating its value.
>
> The BPF_LINK_TYPE_STRUCT_OPS serves a dual purpose. Not only is it
> used to craft the links for BPF struct_ops programs, but also to
> create links for BPF struct_ops them-self. Since the links of BPF
> struct_ops programs are only used to create trampolines internally,
> they are never seen in other contexts. Thus, they can be reused for
> struct_ops themself.
>
> To maintain a reference to the map supporting this link, we add
> bpf_struct_ops_link as an additional type. The pointer of the map is
> RCU and won't be necessary until later in the patchset.
>
> Signed-off-by: Kui-Feng Lee <kuifeng@xxxxxxxx>
> ---
>  include/linux/bpf.h            |  11 +++
>  include/uapi/linux/bpf.h       |  12 +++-
>  kernel/bpf/bpf_struct_ops.c    | 119 +++++++++++++++++++++++++++++++--
>  kernel/bpf/syscall.c           |  23 ++++---
>  tools/include/uapi/linux/bpf.h |  12 +++-
>  5 files changed, 163 insertions(+), 14 deletions(-)
>

[...]

> +int bpf_struct_ops_link_create(union bpf_attr *attr)
> +{
> +	struct bpf_struct_ops_link *link = NULL;
> +	struct bpf_link_primer link_primer;
> +	struct bpf_struct_ops_map *st_map;
> +	struct bpf_map *map;
> +	int err;
> +
> +	map = bpf_map_get(attr->link_create.map_fd);
> +	if (!map)
> +		return -EINVAL;
> +
> +	st_map = (struct bpf_struct_ops_map *)map;
> +
> +	if (map->map_type != BPF_MAP_TYPE_STRUCT_OPS || !(map->map_flags & BPF_F_LINK) ||
> +	    /* Pair with smp_store_release() during map_update */
> +	    smp_load_acquire(&st_map->kvalue.state) != BPF_STRUCT_OPS_STATE_READY) {
> +		err = -EINVAL;
> +		goto err_out;
> +	}
> +
> +	link = kzalloc(sizeof(*link), GFP_USER);
> +	if (!link) {
> +		err = -ENOMEM;
> +		goto err_out;
> +	}
> +	bpf_link_init(&link->link, BPF_LINK_TYPE_STRUCT_OPS, &bpf_struct_ops_map_lops, NULL);
> +	RCU_INIT_POINTER(link->map, map);
> +
> +	err = bpf_link_prime(&link->link, &link_primer);
> +	if (err)
> +		goto err_out;
> +
> +	err = st_map->st_ops->reg(st_map->kvalue.data);
> +	if (err) {
> +		bpf_link_cleanup(&link_primer);

Set link = NULL here to avoid kfree()-ing it at err_out; see
bpf_tracing_prog_attach() for a similar approach.

> +		goto err_out;
> +	}
> +
> +	return bpf_link_settle(&link_primer);
> +
> +err_out:
> +	bpf_map_put(map);
> +	kfree(link);
> +	return err;
> +}
> +

[...]
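
To illustrate the comment above, here is a minimal userspace sketch of the pattern, not kernel code. All mock_* names and free_count are hypothetical stand-ins, and the sketch assumes (as the comment implies) that the cleanup call takes over releasing the link. Once cleanup owns the object, clearing the local pointer turns the free in the shared error path into a no-op.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bpf_struct_ops_link. */
struct mock_link {
	int id;
};

/* Counts how many times a link was released through mock_link_free(). */
static int free_count;

/* Like kfree(): releasing a NULL pointer does nothing. */
static void mock_link_free(struct mock_link *link)
{
	if (!link)
		return;
	free_count++;
	free(link);
}

/* Models the assumption that cleanup releases the link it was given. */
static void mock_link_cleanup(struct mock_link *link)
{
	assert(link != NULL);
	mock_link_free(link);
}

/* reg_err simulates the return value of the registration callback. */
static int mock_link_create(int reg_err)
{
	struct mock_link *link;
	int err;

	link = calloc(1, sizeof(*link));
	if (!link) {
		err = -ENOMEM;
		goto err_out;
	}

	err = reg_err;
	if (err) {
		mock_link_cleanup(link);
		link = NULL;	/* cleanup owns it now; do not free again below */
		goto err_out;
	}

	/* Success: a real link would stay alive; the demo just releases it. */
	free(link);
	return 0;

err_out:
	mock_link_free(link);	/* no-op when link is NULL */
	return err;
}
```

Without the link = NULL assignment, the failing path would release the same allocation twice, once in cleanup and once at err_out.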