From: Hou Tao <houtao1@xxxxxxxxxx>

Hi,

V5 incorporates suggestions from Alexei and Paul (big thanks for that).
The main changes include:

*) Use per-cpu lists for the reusable list and the freeing list to reduce
   lock contention and retain the NUMA-aware attribute
*) Use multiple RCU callbacks for reuse, as v3 did
*) Use rcu_momentary_dyntick_idle() to reduce the peak memory footprint

Please see individual patches for more details. As usual, comments and
suggestions are always welcome.

Change Log:
v5:
 * remove prepare_reuse_head and prepare_reuse_tail
 * use 32 as both low_watermark and high_watermark
 * use per-cpu lists for the reusable list and the freeing list
 * use multiple RCU callbacks to do object reuse
 * remove *_tail for all lists
 * use rcu_momentary_dyntick_idle() to shorten the RCU grace period

v4: https://lore.kernel.org/bpf/20230606035310.4026145-1-houtao@xxxxxxxxxxxxxxx/
 * no kworker (Alexei)
 * Use a global reusable list in bpf memory allocator (Alexei)
 * Remove BPF_MA_FREE_AFTER_RCU_GP flag and do reuse-after-rcu-gp by
   default in bpf memory allocator (Alexei)
 * add benchmark results from map_perf_test (Alexei)

v3: https://lore.kernel.org/bpf/20230429101215.111262-1-houtao@xxxxxxxxxxxxxxx/
 * add BPF_MA_FREE_AFTER_RCU_GP for bpf memory allocator
 * Update htab memory benchmark
   * move the benchmark patch to the last patch
   * remove array and useless bpf_map_lookup_elem(&array, ...) in bpf
     programs
   * add synchronization between the addition CPU and the deletion CPU
     for the add_del_on_diff_cpu case to prevent unnecessary looping
   * add the benchmark result for "extra call_rcu + bpf ma"

v2: https://lore.kernel.org/bpf/20230408141846.1878768-1-houtao@xxxxxxxxxxxxxxx/
 * add a benchmark for bpf memory allocator to compare different
   flavors of bpf memory allocator.
 * implement BPF_MA_REUSE_AFTER_RCU_GP for bpf memory allocator.

v1: https://lore.kernel.org/bpf/20221230041151.1231169-1-houtao@xxxxxxxxxxxxxxx/

Hou Tao (2):
  bpf: Only reuse after one RCU GP in bpf memory allocator
  bpf: Call rcu_momentary_dyntick_idle() in task work periodically

 kernel/bpf/memalloc.c | 371 ++++++++++++++++++++++++++++--------------
 1 file changed, 250 insertions(+), 121 deletions(-)

--
2.29.2