At 2024-01-14 02:24:07, "Jozsef Kadlecsik" <kadlec@xxxxxxxxxxxxxxxxx> wrote:
>On Thu, 11 Jan 2024, David Wang wrote:
>
>> I tested the patch with code stressing swap->destroy->create->add 10000
>> times, the performance regression still happens, and now it is
>> ip_set_destroy. (I pasted the test code at the end of this mail)
>>
>> They all call wait_for_completion, which may sleep on something on
>> purpose, I guess...
>
>That's OK because ip_set_destroy() calls rcu_barrier() which is needed to
>handle flush in list type of sets.
>
>However, rcu_barrier() with call_rcu() together makes multiple destroys
>one after another slow. But rcu_barrier() is needed for list type of sets
>only and that can be handled separately. So could you test the patch
>below? According to my tests it is even a little bit faster than the
>original code before synchronize_rcu() was added to swap.

Confirmed~! This patch does fix the performance regression in my case.

Hope it can fix ale.crismani@xxxxxxxxxxxxxx's original issue.

Thanks~
David