On Sun, 9 Jul 2023, David Rientjes wrote:

> There are some substantial performance degradations, most notably
> context_switch1_per_thread_ops which regressed ~21%.  I'll need to repeat
> that test to confirm it and can also try on cascadelake if it reproduces.
>

So the regression on skylake for will-it-scale appears to be real:

 LABEL                            | COUNT | MIN        | MAX        | MEAN       | MEDIAN     | STDDEV | DIRECTION
----------------------------------+-------+------------+------------+------------+------------+--------+------------
 context_switch1_per_thread_ops   |       |            |            |            |            |        |
 (A) v6.1.30                      | 1     | 314507.000 | 314507.000 | 314507.000 | 314507.000 | 0      |
 (B) v6.1.30 slab_nomerge         | 1     | 257403.000 | 257403.000 | 257403.000 | 257403.000 | 0      |
 !! REGRESSED !!                  |       | -18.16%    | -18.16%    | -18.16%    | -18.16%    | ---    | + is good

but I can't reproduce this on cascadelake:

 LABEL                            | COUNT | MIN        | MAX        | MEAN       | MEDIAN     | STDDEV | DIRECTION
----------------------------------+-------+------------+------------+------------+------------+--------+------------
 context_switch1_per_thread_ops   |       |            |            |            |            |        |
 (A) v6.1.30                      | 1     | 301128.000 | 301128.000 | 301128.000 | 301128.000 | 0      |
 (B) v6.1.30 slab_nomerge         | 1     | 301282.000 | 301282.000 | 301282.000 | 301282.000 | 0      |
                                  |       | +0.05%     | +0.05%     | +0.05%     | +0.05%     | ---    | + is good

So I'm a bit baffled at the moment.

I'll try to dig deeper and see what slab caches this benchmark exercises
that apparently no other benchmarks do.  (I'm really hoping that the only
way to recover this performance is by something like
kmem_cache_create(SLAB_MERGE).)