On Thu, 2024-09-26 at 15:19 +0800, Leon Hwang wrote: [...] > > E.g. suppose the following sequence of events: > > - thread #1 enters prog_fd_array_get_ptr() > > - thread #1 successfully completes prog->aux->is_extended check (not extended) > > - thread #2 enters bpf_tracing_prog_attach() > > - thread #2 does atomic_read() for tgt_prog and it returns 0 > > - thread #2 proceeds attaching freplace to tgt_prog > > - thread #1 does atomic_inc(&prog->aux->tail_callee_cnt) > > > > Thus arriving to a state when tgt_prog is both a member of a map and > > is freplaced. Is this a valid scenario? > > > > This patch series aims to prevent such case that tgt_prog is a member of > prog_array and is freplaced at the same time. > > Without this patch series, a prog can be extended by freplace prog and then > be updated to prog_array, or can be updated to prog_array and then be > extended by freplace prog, in order to construct such case. > > This patch aims to prevent "be updated to prog_array and then be extended > by freplace prog". > The previous patch aims to prevent "be extended by freplace prog and then > be updated to prog_array". > > So, in order to avoid the above case: > > diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c > index a43e62e2a8bb..da4e26029a33 100644 > --- a/kernel/bpf/arraymap.c > +++ b/kernel/bpf/arraymap.c > @@ -948,7 +948,9 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map, > if (IS_ERR(prog)) > return prog; > > - if (!bpf_prog_map_compatible(map, prog)) { > + atomic_inc(&prog->aux->tail_callee_cnt); > + if (!bpf_prog_map_compatible(map, prog) || prog->aux->is_extended) { > + atomic_dec(&prog->aux->tail_callee_cnt); > bpf_prog_put(prog); > return ERR_PTR(-EINVAL); > } I'm not sure this really solves the issue. Documentation for both 'atomic_inc' and 'atomic_read' (used in bpf_tracing_prog_attach()) says that these are operations with relaxed memory ordering. Meaning that e.g. 'atomic_inc' executed inside prog_fd_array_get_ptr() is not necessarily immediately visible for other thread executing 'atomic_read' in bpf_tracing_prog_attach(). I think that some memory barrier is needed (non-relaxed func variant). But all this gets unnecessarily complicated, neither prog_fd_array_get_ptr() nor bpf_tracing_prog_attach() are executed often, I think that 'tail_callee_cnt' and 'is_extended' should be protected by a mutex.