On Mon, Sep 23, 2024 at 11:16 AM yushengjin <yushengjin@xxxxxxxxxxxxx> wrote:
>
> When running wrk tests, the CPU usage of the test machine reached 100%.
> When forwarding through a bridge under high network load, ebt_do_table()
> in the kernel ebtables module can become abnormally loaded, leading to
> excessive soft interrupts and sometimes even CPU soft lockups.
>
> After analysis, it turned out that the ebtables code has not been optimized
> for a long time and still takes a read-write lock in the packet path, whereas
> the arp/ip/ip6 tables were reworked long ago, after that read-write lock was
> identified as a performance bottleneck.
>
> Ref link: https://lore.kernel.org/lkml/20090428092411.5331c4a1@nehalam/
>
> So I followed the arp/ip/ip6 tables approach to replace the read-write
> lock in ebtables.c.
>
> Test method:
> 1) Create bridges on the test machine:
> ``` bash
> brctl addbr br-a
> brctl addbr br-b
> brctl addif br-a enp1s0f0 enp1s0f1
> brctl addif br-b enp130s0f0 enp130s0f1
> ifconfig br-a up
> ifconfig br-b up
> ```
> 2) Generate load from another machine:
> ``` bash
> ulimit -n 2048
> ./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://4.4.4.2:80/4k.html &
> ./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://5.5.5.2:80/4k.html &
> ```
>
> Signed-off-by: yushengjin <yushengjin@xxxxxxxxxxxxx>
> ---
>  include/linux/netfilter_bridge/ebtables.h |  47 +++++++-
>  net/bridge/netfilter/ebtables.c           | 132 ++++++++++++++++------
>  2 files changed, 145 insertions(+), 34 deletions(-)
>
> diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
> index fd533552a062..dd52dea20fb8 100644
> --- a/include/linux/netfilter_bridge/ebtables.h
> +++ b/include/linux/netfilter_bridge/ebtables.h
> @@ -93,7 +93,6 @@ struct ebt_table {
>  	char name[EBT_TABLE_MAXNAMELEN];
>  	struct ebt_replace_kernel *table;
>  	unsigned int valid_hooks;
> -	rwlock_t lock;
>  	/* the data used by the kernel */
>  	struct ebt_table_info *private;
>  	struct nf_hook_ops *ops;
> @@ -124,4 +123,50 @@ static inline bool ebt_invalid_target(int target)
>
>  int ebt_register_template(const struct ebt_table *t, int(*table_init)(struct net *net));
>  void ebt_unregister_template(const struct ebt_table *t);
> +
> +/**
> + * ebt_recseq - recursive seqcount for netfilter use
> + *
> + * Packet processing changes the seqcount only if no recursion happened
> + * get_counters() can use read_seqcount_begin()/read_seqcount_retry(),
> + * because we use the normal seqcount convention :
> + * Low order bit set to 1 if a writer is active.
> + */
> +DECLARE_PER_CPU(seqcount_t, ebt_recseq);
> +
> +/**
> + * ebt_write_recseq_begin - start of a write section
> + *
> + * Begin packet processing : all readers must wait the end
> + * 1) Must be called with preemption disabled
> + * 2) softirqs must be disabled too (or we should use this_cpu_add())
> + * Returns :
> + *  1 if no recursion on this cpu
> + *  0 if recursion detected
> + */
> +static inline unsigned int ebt_write_recseq_begin(void)
> +{
> +	unsigned int addend;
> +
> +	addend = (__this_cpu_read(ebt_recseq.sequence) + 1) & 1;
> +
> +	__this_cpu_add(ebt_recseq.sequence, addend);
> +	smp_mb();
> +
> +	return addend;
> +}
> +
> +/**
> + * ebt_write_recseq_end - end of a write section
> + * @addend: return value from previous ebt_write_recseq_begin()
> + *
> + * End packet processing : all readers can proceed
> + * 1) Must be called with preemption disabled
> + * 2) softirqs must be disabled too (or we should use this_cpu_add())
> + */
> +static inline void ebt_write_recseq_end(unsigned int addend)
> +{
> +	smp_wmb();
> +	__this_cpu_add(ebt_recseq.sequence, addend);
> +}

Why not reuse xt_recseq, xt_write_recseq_begin() and xt_write_recseq_end()
instead of copy/pasting them?

This was added in commit 7f5c6d4f665bb57a19a34ce1fb16cc708c04f219
("netfilter: get rid of atomic ops in fast path").

If this is an include mess, just move them to a separate include file.
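For illustration, here is a rough sketch of what the per-packet path could look
like if it reused the existing x_tables helpers rather than a private
ebt_recseq. Only xt_recseq, xt_write_recseq_begin()/xt_write_recseq_end() and
the ebt_* types come from the kernel; the function name and the elided rule
walk are placeholders, not the proposed patch:

``` c
/*
 * Rough sketch only, not the proposed patch: the per-packet path marks a
 * "writer" section with the shared x_tables recursive seqcount.  The
 * actual ebtables rule traversal is elided.
 */
#include <linux/netfilter/x_tables.h>		/* xt_write_recseq_begin/end */
#include <linux/netfilter_bridge/ebtables.h>

static unsigned int ebt_do_table_sketch(void *priv, struct sk_buff *skb,
					const struct nf_hook_state *state)
{
	struct ebt_table *table = priv;
	const struct ebt_table_info *private;
	unsigned int addend;
	unsigned int verdict = NF_ACCEPT;

	/* Both helpers require softirqs (and thus preemption) disabled. */
	local_bh_disable();
	addend = xt_write_recseq_begin();

	/* Replaces read_lock_bh(&table->lock); counter readers retry instead. */
	private = READ_ONCE(table->private);

	/* ... walk private->hook_entry[state->hook], bump per-cpu counters ... */

	xt_write_recseq_end(addend);
	local_bh_enable();

	return verdict;
}
```

The counter readers would then pair with the same per-cpu xt_recseq through
read_seqcount_begin()/read_seqcount_retry(), as the arp/ip/ip6 tables'
get_counters() already does.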