Re: [PATCH] net/bridge: Optimizing read-write locks in ebtables.c

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




在 23/9/2024 下午5:29, Eric Dumazet 写道:
On Mon, Sep 23, 2024 at 11:16 AM yushengjin <yushengjin@xxxxxxxxxxxxx> wrote:
When conducting WRK testing, the CPU usage rate of the testing machine was
100%. forwarding through a bridge, if the network load is too high, it may
cause abnormal load on the ebt_do_table of the kernel ebtable module, leading
to excessive soft interrupts and sometimes even directly causing CPU soft
deadlocks.

After analysis, it was found that the code of ebtables had not been optimized
for a long time, and the read-write locks inside still existed. However, other
arp/ip/ip6 tables had already been optimized a lot, and performance bottlenecks
in read-write locks had been discovered a long time ago.

Ref link: https://lore.kernel.org/lkml/20090428092411.5331c4a1@nehalam/

So I referred to arp/ip/ip6 modification methods to optimize the read-write
lock in ebtables.c.

test method:
1) Test machine creates bridge :
``` bash
brctl addbr br-a
brctl addbr br-b
brctl addif br-a enp1s0f0 enp1s0f1
brctl addif br-b enp130s0f0 enp130s0f1
ifconfig br-a up
ifconfig br-b up
```
2) Testing with another machine:
``` bash
ulimit -n 2048
./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://4.4.4.2:80/4k.html &
./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://5.5.5.2:80/4k.html &
```

Signed-off-by: yushengjin <yushengjin@xxxxxxxxxxxxx>
---
  include/linux/netfilter_bridge/ebtables.h |  47 +++++++-
  net/bridge/netfilter/ebtables.c           | 132 ++++++++++++++++------
  2 files changed, 145 insertions(+), 34 deletions(-)

diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
index fd533552a062..dd52dea20fb8 100644
--- a/include/linux/netfilter_bridge/ebtables.h
+++ b/include/linux/netfilter_bridge/ebtables.h
@@ -93,7 +93,6 @@ struct ebt_table {
         char name[EBT_TABLE_MAXNAMELEN];
         struct ebt_replace_kernel *table;
         unsigned int valid_hooks;
-       rwlock_t lock;
         /* the data used by the kernel */
         struct ebt_table_info *private;
         struct nf_hook_ops *ops;
@@ -124,4 +123,50 @@ static inline bool ebt_invalid_target(int target)

  int ebt_register_template(const struct ebt_table *t, int(*table_init)(struct net *net));
  void ebt_unregister_template(const struct ebt_table *t);
+
+/**
+ * ebt_recseq - recursive seqcount for netfilter use
+ *
+ * Packet processing changes the seqcount only if no recursion happened
+ * get_counters() can use read_seqcount_begin()/read_seqcount_retry(),
+ * because we use the normal seqcount convention :
+ * Low order bit set to 1 if a writer is active.
+ */
+DECLARE_PER_CPU(seqcount_t, ebt_recseq);
+
+/**
+ * ebt_write_recseq_begin - start of a write section
+ *
+ * Begin packet processing : all readers must wait the end
+ * 1) Must be called with preemption disabled
+ * 2) softirqs must be disabled too (or we should use this_cpu_add())
+ * Returns :
+ *  1 if no recursion on this cpu
+ *  0 if recursion detected
+ */
+static inline unsigned int ebt_write_recseq_begin(void)
+{
+       unsigned int addend;
+
+       addend = (__this_cpu_read(ebt_recseq.sequence) + 1) & 1;
+
+       __this_cpu_add(ebt_recseq.sequence, addend);
+       smp_mb();
+
+       return addend;
+}
+
+/**
+ * ebt_write_recseq_end - end of a write section
+ * @addend: return value from previous ebt_write_recseq_begin()
+ *
+ * End packet processing : all readers can proceed
+ * 1) Must be called with preemption disabled
+ * 2) softirqs must be disabled too (or we should use this_cpu_add())
+ */
+static inline void ebt_write_recseq_end(unsigned int addend)
+{
+       smp_wmb();
+       __this_cpu_add(ebt_recseq.sequence, addend);
+}
Why not reusing xt_recseq, xt_write_recseq_begin(), xt_write_recseq_end(),
instead of copy/pasting them ?

This was added in

commit 7f5c6d4f665bb57a19a34ce1fb16cc708c04f219    netfilter: get rid
of atomic ops in fast path
They used different seqcounts, I'm worried it might have an impact.

If this is an include mess, just move them in a separate include file.

Can i copy  ebt_write_recseq_begin(), ebt_write_recseq_endend to
include/linux/netfilter/x_tables.h ?

Or add a parameter in xt_write_recseq_begin() , xt_write_recseq_end()  to
clarify whether it is xt_recseq or ebt_recseq.






[Index of Archives]     [Netdev]     [AoE Tools]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]     [Video 4 Linux]

  Powered by Linux