On Thu, Jun 25, 2020 at 10:50 PM CEST, Andrii Nakryiko wrote: > On Thu, Jun 25, 2020 at 7:17 AM Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> wrote: >> >> Prepare for having multi-prog attachments for new netns attach types by >> storing programs to run in a bpf_prog_array, which is well suited for >> iterating over programs and running them in sequence. >> >> After this change bpf(PROG_QUERY) may block to allocate memory in >> bpf_prog_array_copy_to_user() for collected program IDs. This forces a >> change in how we protect access to the attached program in the query >> callback. Because bpf_prog_array_copy_to_user() can sleep, we switch from >> an RCU read lock to holding a mutex that serializes updaters. >> >> Because we allow only one BPF flow_dissector program to be attached to >> netns at all times, the bpf_prog_array pointed by net->bpf.run_array is >> always either detached (null) or one element long. >> >> No functional changes intended. >> >> Signed-off-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> >> --- > > I wonder if instead of NULL prog_array, it's better to just use a > dummy empty (but allocated) array. Might help eliminate some of the > IFs, maybe even in the hot path. That was my initial approach, which I abandoned seeing that it leads to replacing NULL prog_array checks in flow_dissector with bpf_prog_array_is_empty() checks to determine which netns has a BPF program attached. So no IFs gone there. While on the hot path, where we run the program, we probably would still be left with an IF checking for empty prog_array to avoid building the context if no progs will RUN. The checks I'm referring to are on attach path, in flow_dissector_bpf_prog_attach_check(), and hot-path, __skb_flow_dissect(). > > >> include/net/netns/bpf.h | 5 +- >> kernel/bpf/net_namespace.c | 120 +++++++++++++++++++++++++------------ >> net/core/flow_dissector.c | 19 +++--- >> 3 files changed, 96 insertions(+), 48 deletions(-) >> > > [...] > > >> >> +/* Must be called with netns_bpf_mutex held. */ >> +static int __netns_bpf_prog_query(const union bpf_attr *attr, >> + union bpf_attr __user *uattr, >> + struct net *net, >> + enum netns_bpf_attach_type type) >> +{ >> + __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids); >> + struct bpf_prog_array *run_array; >> + u32 prog_cnt = 0, flags = 0; >> + >> + run_array = rcu_dereference_protected(net->bpf.run_array[type], >> + lockdep_is_held(&netns_bpf_mutex)); >> + if (run_array) >> + prog_cnt = bpf_prog_array_length(run_array); >> + >> + if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags))) >> + return -EFAULT; >> + if (copy_to_user(&uattr->query.prog_cnt, &prog_cnt, sizeof(prog_cnt))) >> + return -EFAULT; >> + if (!attr->query.prog_cnt || !prog_ids || !prog_cnt) >> + return 0; >> + >> + return bpf_prog_array_copy_to_user(run_array, prog_ids, >> + attr->query.prog_cnt); > > It doesn't seem like bpf_prog_array_copy_to_user can handle NULL run_array Correct. And we never invoke it when run_array is NULL because then prog_cnt == 0. > >> +} >> + >> int netns_bpf_prog_query(const union bpf_attr *attr, >> union bpf_attr __user *uattr) >> { >> - __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids); >> - u32 prog_id, prog_cnt = 0, flags = 0; >> enum netns_bpf_attach_type type; >> - struct bpf_prog *attached; >> struct net *net; >> + int ret; >> >> if (attr->query.query_flags) >> return -EINVAL; > > [...]