On 09.02.23 16:01, Marcelo Tosatti wrote:
> In preparation to switch vmstat shepherd to flush
> per-CPU counters remotely, switch all functions that
> modify the counters to use cmpxchg.
>
> To test the performance difference, a page allocator microbenchmark:
> https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench01.c
> with loops=1000000 was used, on Intel Core i7-11850H @ 2.50GHz.
>
> For the single_page_alloc_free test, which does
>
>         /** Loop to measure **/
>         for (i = 0; i < rec->loops; i++) {
>                 my_page = alloc_page(gfp_mask);
>                 if (unlikely(my_page == NULL))
>                         return 0;
>                 __free_page(my_page);
>         }
>
> Unit is cycles.
>
> Vanilla         Patched         Diff
> 159             165             3.7%
>
> Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
>
> Index: linux-vmstat-remote/mm/vmstat.c
> ===================================================================
> --- linux-vmstat-remote.orig/mm/vmstat.c
> +++ linux-vmstat-remote/mm/vmstat.c
> @@ -334,6 +334,188 @@ void set_pgdat_percpu_threshold(pg_data_
>          }
>   }
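For readers unfamiliar with the pattern the quoted description refers to, here is a minimal userspace sketch of a cmpxchg-based counter update. It is not the code from the patch: it uses C11 atomics rather than the kernel's own per-CPU primitives, and `struct counter`, `counter_mod()` and `counter_flush()` are hypothetical names chosen for illustration. The idea it tries to show is that once every local update goes through a compare-and-exchange loop, another thread can drain the small per-CPU delta concurrently without losing updates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-CPU delta plus its global total. */
struct counter {
	_Atomic signed char cpu_diff;	/* small local delta */
	atomic_long global;		/* global total */
	int threshold;			/* fold into global above this */
};

/* Local update: retry until the delta is swapped in atomically. */
static void counter_mod(struct counter *c, int delta)
{
	signed char old = atomic_load(&c->cpu_diff);
	long overflow;
	int new;

	/* The stored delta must fit in a signed char. */
	assert(c->threshold >= 0 && c->threshold <= 127);

	do {
		overflow = 0;
		new = old + delta;
		if (abs(new) > c->threshold) {
			/* Over the threshold: move everything to global. */
			overflow = new;
			new = 0;
		}
		/* On failure, 'old' is reloaded and the loop recomputes. */
	} while (!atomic_compare_exchange_weak(&c->cpu_diff, &old,
					       (signed char)new));

	if (overflow)
		atomic_fetch_add(&c->global, overflow);
}

/* Flush that may run on another thread: drain the delta atomically. */
static long counter_flush(struct counter *c)
{
	signed char diff = atomic_exchange(&c->cpu_diff, 0);

	if (diff)
		atomic_fetch_add(&c->global, diff);
	return atomic_load(&c->global);
}
```

With a threshold of 4, seven increments of 1 would leave 5 in the global total and 2 in the local delta, and a flush would then bring the global total to 7. The retry loop is also the likely source of the few extra cycles measured in the benchmark above, though that is an assumption rather than something the description states.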
I wonder why we get a diff that is rather hard to review, because it removes all the existing code and replaces it with almost-identical code. Are you maybe moving a bunch of code while modifying some tiny bits at the same time?
-- 
Thanks,

David / dhildenb