On 25/9/24 13:32, Eduard Zingerman wrote: > On Mon, 2024-09-23 at 21:40 +0800, Leon Hwang wrote: > > [...] > >> diff --git a/include/linux/bpf.h b/include/linux/bpf.h >> index 048aa2625cbef..b864b37e67c17 100644 >> --- a/include/linux/bpf.h >> +++ b/include/linux/bpf.h >> @@ -1484,6 +1484,7 @@ struct bpf_prog_aux { >> bool exception_cb; >> bool exception_boundary; >> bool is_extended; /* true if extended by freplace program */ >> + atomic_t tail_callee_cnt; > > Nit: the name is a bit misleading, this counts how many times the > program resides it prog maps. Confusing w/o additional comments. > Maybe something like 'member_of_prog_array_cnt'? > 'member_of_prog_array_cnt' is not accurate enough. 'prog_array_member_cnt' is better, and should alongside comment /* counts how many times as member of prog_array */. >> struct bpf_arena *arena; >> /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */ >> const struct btf_type *attach_func_proto; >> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c >> index 8d97bae98fa70..c12e0e3bf6ad0 100644 >> --- a/kernel/bpf/arraymap.c >> +++ b/kernel/bpf/arraymap.c >> @@ -961,13 +961,17 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map, >> return ERR_PTR(-EINVAL); >> } >> >> + atomic_inc(&prog->aux->tail_callee_cnt); >> return prog; >> } > > [...] > >> static u32 prog_fd_array_sys_lookup_elem(void *ptr) >> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c >> index 18b3f9216b050..be829016d8182 100644 >> --- a/kernel/bpf/syscall.c >> +++ b/kernel/bpf/syscall.c >> @@ -3501,6 +3501,18 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog, >> tgt_prog = prog->aux->dst_prog; >> } >> >> + if (prog->type == BPF_PROG_TYPE_EXT && >> + atomic_read(&tgt_prog->aux->tail_callee_cnt)) { >> + /* Program extensions can not extend target prog when the target >> + * prog has been updated to any prog_array map as tail callee. >> + * It's to prevent a potential infinite loop like: >> + * tgt prog entry -> tgt prog subprog -> freplace prog entry >> + * --tailcall-> tgt prog entry. >> + */ >> + err = -EINVAL; >> + goto out_unlock; >> + } >> + >> err = bpf_link_prime(&link->link.link, &link_primer); >> if (err) >> goto out_unlock; > > Is it possible there is a race between map update and prog attach? Yes, it is possible. > E.g. suppose the following sequence of events: > - thread #1 enters prog_fd_array_get_ptr() > - thread #1 successfully completes prog->aux->is_extended check (not extended) > - thread #2 enters bpf_tracing_prog_attach() > - thread #2 does atomic_read() for tgt_prog and it returns 0 > - thread #2 proceeds attaching freplace to tgt_prog > - thread #1 does atomic_inc(&prog->aux->tail_callee_cnt) > > Thus arriving to a state when tgt_prog is both a member of a map and > is freplaced. Is this a valid scenario? > This patch series aims to prevent such case that tgt_prog is a member of prog_array and is freplaced at the same time. Without this patch series, a prog can be extended by freplace prog and then be updated to prog_array, or can be updated to prog_array and then be extended by freplace prog, in order to construct such case. This patch aims to prevent "be updated to prog_array and then be extended by freplace prog". The previous patch aims to prevent "be extended by freplace prog and then be updated to prog_array". So, in order to avoid the above case: diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index a43e62e2a8bb..da4e26029a33 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -948,7 +948,9 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map, if (IS_ERR(prog)) return prog; - if (!bpf_prog_map_compatible(map, prog)) { + atomic_inc(&prog->aux->tail_callee_cnt); + if (!bpf_prog_map_compatible(map, prog) || prog->aux->is_extended) { + atomic_dec(&prog->aux->tail_callee_cnt); bpf_prog_put(prog); return ERR_PTR(-EINVAL); } 1. Increment tail_callee_cnt. 2. Decrement tail_callee_cnt, if prog->aux->is_extended. Then, thread #2 does atomic_read() for tgt_prog, and it won't return 0. Thanks, Leon