On Wed, May 29, 2024 at 3:38 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > > On 5/29/24 8:04 AM, Kuifeng Lee wrote: > > On Tue, May 28, 2024 at 11:17 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > >> > >> On 5/24/24 3:30 PM, Kui-Feng Lee wrote: > >>> +static int bpf_struct_ops_map_link_detach(struct bpf_link *link) > >>> +{ > >>> + struct bpf_struct_ops_link *st_link = container_of(link, struct bpf_struct_ops_link, link); > >>> + struct bpf_struct_ops_map *st_map; > >>> + struct bpf_map *map; > >>> + > >>> + mutex_lock(&update_mutex); > >> > >> update_mutex is needed to detach. > >> > >>> + > >>> + map = rcu_dereference_protected(st_link->map, lockdep_is_held(&update_mutex)); > >>> + if (!map) { > >>> + mutex_unlock(&update_mutex); > >>> + return 0; > >>> + } > >>> + st_map = container_of(map, struct bpf_struct_ops_map, map); > >>> + > >>> + st_map->st_ops_desc->st_ops->unreg(&st_map->kvalue.data, link); > >>> + > >>> + rcu_assign_pointer(st_link->map, NULL); > >>> + /* Pair with bpf_map_get() in bpf_struct_ops_link_create() or > >>> + * bpf_map_inc() in bpf_struct_ops_map_link_update(). > >>> + */ > >>> + bpf_map_put(&st_map->map); > >>> + > >>> + mutex_unlock(&update_mutex); > >>> + > >>> + return 0; > >>> +} > >>> + > >>> static const struct bpf_link_ops bpf_struct_ops_map_lops = { > >>> .dealloc = bpf_struct_ops_map_link_dealloc, > >>> + .detach = bpf_struct_ops_map_link_detach, > >>> .show_fdinfo = bpf_struct_ops_map_link_show_fdinfo, > >>> .fill_link_info = bpf_struct_ops_map_link_fill_link_info, > >>> .update_map = bpf_struct_ops_map_link_update, > >>> @@ -1176,13 +1208,22 @@ int bpf_struct_ops_link_create(union bpf_attr *attr) > >>> if (err) > >>> goto err_out; > >>> > >>> + /* Init link->map before calling reg() in case being detached > >>> + * immediately. > >>> + */ > >> > >> With update_mutex held in link_create here, the parallel detach can still happen > >> before the link is fully initialized (the link->map pointer here in particular)? > >> > >>> + RCU_INIT_POINTER(link->map, map); > >>> + > >>> + mutex_lock(&update_mutex); > >>> err = st_map->st_ops_desc->st_ops->reg(st_map->kvalue.data, &link->link); > >>> if (err) { > >>> + RCU_INIT_POINTER(link->map, NULL); > >> > >> I was hoping by holding the the update_mutex, it can avoid this link->map init > >> dance, like RCU_INIT_POINTER(link->map, map) above and then resetting here on > >> the error case. > >> > >>> + mutex_unlock(&update_mutex); > >>> bpf_link_cleanup(&link_primer); > >>> + /* The link has been free by bpf_link_cleanup() */ > >>> link = NULL; > >>> goto err_out; > >>> } > >>> - RCU_INIT_POINTER(link->map, map); > >> > >> If only init link->map once here like the existing code (and the init is > >> protected by the update_mutex), the subsystem should not be able to detach until > >> the link->map is fully initialized. > >> > >> or I am missing something obvious. Can you explain why this link->map init dance > >> is still needed? > > > > Ok, I get what you mean. > > > > I will move RCU_INIT_POINTER() back to its original place, and move the check > > on the value of "err" to the place after mutext_unlock(). > The RCU_INIT_POINTER(link->map, map) needs to be done with update_mutex held and > it should be init after the err check, so the err check needs to be inside > update_mutex lock also. > > Something like this (untested): > > mutex_lock(&update_mutex); > > err = st_map->st_ops_desc->st_ops->reg(st_map->kvalue.data, &link->link); > if (err) { > mutex_unlock(&update_mutex); > bpf_link_cleanup(&link_primer); > link = NULL; > goto err_out; > } > RCU_INIT_POINTER(link->map, map); > > mutex_unlock(&update_mutex); > Sure! According to what we discussed off-line, the RCU_INIT_POINTER() will be moved back to its original place. Subsystems should not try to access link->map. > > > > >> > >>> + mutex_unlock(&update_mutex); > >> >