On Fri, Jun 26, 2020 at 2:45 AM Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> wrote: > > On Thu, Jun 25, 2020 at 10:50 PM CEST, Andrii Nakryiko wrote: > > On Thu, Jun 25, 2020 at 7:17 AM Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> wrote: > >> > >> Prepare for having multi-prog attachments for new netns attach types by > >> storing programs to run in a bpf_prog_array, which is well suited for > >> iterating over programs and running them in sequence. > >> > >> After this change bpf(PROG_QUERY) may block to allocate memory in > >> bpf_prog_array_copy_to_user() for collected program IDs. This forces a > >> change in how we protect access to the attached program in the query > >> callback. Because bpf_prog_array_copy_to_user() can sleep, we switch from > >> an RCU read lock to holding a mutex that serializes updaters. > >> > >> Because we allow only one BPF flow_dissector program to be attached to > >> netns at all times, the bpf_prog_array pointed by net->bpf.run_array is > >> always either detached (null) or one element long. > >> > >> No functional changes intended. > >> > >> Signed-off-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> > >> --- > > > > I wonder if instead of NULL prog_array, it's better to just use a > > dummy empty (but allocated) array. Might help eliminate some of the > > IFs, maybe even in the hot path. > > That was my initial approach, which I abandoned seeing that it leads to > replacing NULL prog_array checks in flow_dissector with > bpf_prog_array_is_empty() checks to determine which netns has a BPF > program attached. So no IFs gone there. > > While on the hot path, where we run the program, we probably would still > be left with an IF checking for empty prog_array to avoid building the > context if no progs will RUN. > > The checks I'm referring to are on attach path, in > flow_dissector_bpf_prog_attach_check(), and hot-path, > __skb_flow_dissect(). > Fair enough. > > > > > >> include/net/netns/bpf.h | 5 +- > >> kernel/bpf/net_namespace.c | 120 +++++++++++++++++++++++++------------ > >> net/core/flow_dissector.c | 19 +++--- > >> 3 files changed, 96 insertions(+), 48 deletions(-) > >> > > > > [...] > > > > > >> > >> +/* Must be called with netns_bpf_mutex held. */ > >> +static int __netns_bpf_prog_query(const union bpf_attr *attr, > >> + union bpf_attr __user *uattr, > >> + struct net *net, > >> + enum netns_bpf_attach_type type) > >> +{ > >> + __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids); > >> + struct bpf_prog_array *run_array; > >> + u32 prog_cnt = 0, flags = 0; > >> + > >> + run_array = rcu_dereference_protected(net->bpf.run_array[type], > >> + lockdep_is_held(&netns_bpf_mutex)); > >> + if (run_array) > >> + prog_cnt = bpf_prog_array_length(run_array); > >> + > >> + if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags))) > >> + return -EFAULT; > >> + if (copy_to_user(&uattr->query.prog_cnt, &prog_cnt, sizeof(prog_cnt))) > >> + return -EFAULT; > >> + if (!attr->query.prog_cnt || !prog_ids || !prog_cnt) > >> + return 0; > >> + > >> + return bpf_prog_array_copy_to_user(run_array, prog_ids, > >> + attr->query.prog_cnt); > > > > It doesn't seem like bpf_prog_array_copy_to_user can handle NULL run_array > > Correct. And we never invoke it when run_array is NULL because then > prog_cnt == 0. Oh, that !prog_cnt above, right.. it's easy to miss. > > > > >> +} > >> + > >> int netns_bpf_prog_query(const union bpf_attr *attr, > >> union bpf_attr __user *uattr) > >> { > >> - __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids); > >> - u32 prog_id, prog_cnt = 0, flags = 0; > >> enum netns_bpf_attach_type type; > >> - struct bpf_prog *attached; > >> struct net *net; > >> + int ret; > >> > >> if (attr->query.query_flags) > >> return -EINVAL; > > > > [...]