On Fri, Jun 26, 2020 at 07:41 AM CEST, Martin KaFai Lau wrote: > On Thu, Jun 25, 2020 at 04:13:55PM +0200, Jakub Sitnicki wrote: >> Prepare for having multi-prog attachments for new netns attach types by >> storing programs to run in a bpf_prog_array, which is well suited for >> iterating over programs and running them in sequence. >> >> After this change bpf(PROG_QUERY) may block to allocate memory in >> bpf_prog_array_copy_to_user() for collected program IDs. This forces a >> change in how we protect access to the attached program in the query >> callback. Because bpf_prog_array_copy_to_user() can sleep, we switch from >> an RCU read lock to holding a mutex that serializes updaters. >> >> Because we allow only one BPF flow_dissector program to be attached to >> netns at all times, the bpf_prog_array pointed by net->bpf.run_array is >> always either detached (null) or one element long. >> >> No functional changes intended. >> >> Signed-off-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> >> --- >> include/net/netns/bpf.h | 5 +- >> kernel/bpf/net_namespace.c | 120 +++++++++++++++++++++++++------------ >> net/core/flow_dissector.c | 19 +++--- >> 3 files changed, 96 insertions(+), 48 deletions(-) >> >> diff --git a/include/net/netns/bpf.h b/include/net/netns/bpf.h >> index a8dce2a380c8..a5015bda9979 100644 >> --- a/include/net/netns/bpf.h >> +++ b/include/net/netns/bpf.h >> @@ -9,9 +9,12 @@ >> #include <linux/bpf-netns.h> >> >> struct bpf_prog; >> +struct bpf_prog_array; >> >> struct netns_bpf { >> - struct bpf_prog __rcu *progs[MAX_NETNS_BPF_ATTACH_TYPE]; >> + /* Array of programs to run compiled from progs or links */ >> + struct bpf_prog_array __rcu *run_array[MAX_NETNS_BPF_ATTACH_TYPE]; >> + struct bpf_prog *progs[MAX_NETNS_BPF_ATTACH_TYPE]; >> struct bpf_link *links[MAX_NETNS_BPF_ATTACH_TYPE]; > With the new run_array, I think the "*progs[]" is not needed. > It seems the original "*progs[]" is only used to tell > if it is in the prog_attach mode or the newer link mode. > There is other ways to do that. > > It is something to think about when there is more clarity on how > multi netns prog will look like in the next set. Having just the run_array without *progs[] is something I've tried initially but ended up rewriting it. The end result was confusing to me. I couldn't convince myself to sign off on it and present it. Adding back the pointer to bpf_prog was counterintutivive, because it is wasteful, but it actually made the code readable. Best I can articulate why it didn't work out great (should have tagged the branch...) is that without *progs[] the run_array holds bpf_prog pointers sometimes with ref-count on the prog (old mode), and sometimes without one (new mode). This mixed state manifests itself mostly on netns teardown, where we need to access the prog_array to put the prog, but only if we're not using links. Then again, perhaps I simply messsed up the code back then, and it deserves another shot. Either way, getting rid of *progs[] is a potential optimization. [...]