From: Xiongwei Song <xiongwei.song@xxxxxxxxxxxxx> The break conditions for filling cpu partial can be more readable and simple. If slub_get_cpu_partial() returns 0, we can confirm that we don't need to fill cpu partial, then we should break from the loop. On the other hand, we also should break from the loop if we have added enough cpu partial slabs. Meanwhile, the logic above gets rid of the #ifdef and also fixes a weird corner case that if we set cpu_partial_slabs to 0 from sysfs, we still allocate at least one here. Signed-off-by: Xiongwei Song <xiongwei.song@xxxxxxxxxxxxx> --- The measurement below is to compare the performance effects when checking if we need to break from the filling cpu partial loop with the following either-or condition: Condition 1: When the count of added cpu slabs is greater than cpu_partial_slabs/2: (partial_slabs > slub_get_cpu_partial(s) / 2) Condition 2: When the count of added cpu slabs is greater than or equal to cpu_partial_slabs/2: (partial_slabs >= slub_get_cpu_partial(s) / 2) The change of breaking condition can effect how many cpu partial slabs would be put on the cpu partial list. Run the test with a "Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz" cpu with 16 cores. The OS is Ubuntu 22.04. hackbench-process-pipes 6.9-rc2(with ">") 6.9.0-rc2(with ">=") Amean 1 0.0373 ( 0.00%) 0.0356 * 4.60%* Amean 4 0.0984 ( 0.00%) 0.1014 * -3.05%* Amean 7 0.1803 ( 0.00%) 0.1851 * -2.69%* Amean 12 0.2947 ( 0.00%) 0.3141 * -6.59%* Amean 21 0.4577 ( 0.00%) 0.4927 * -7.65%* Amean 30 0.6326 ( 0.00%) 0.6649 * -5.10%* Amean 48 0.9396 ( 0.00%) 0.9884 * -5.20%* Amean 64 1.2321 ( 0.00%) 1.3004 * -5.54%* hackbench-process-sockets 6.9-rc2(with ">") 6.9.0-rc2(with ">=") Amean 1 0.0609 ( 0.00%) 0.0623 * -2.35%* Amean 4 0.2107 ( 0.00%) 0.2140 * -1.56%* Amean 7 0.3754 ( 0.00%) 0.3966 * -5.63%* Amean 12 0.6456 ( 0.00%) 0.6734 * -4.32%* Amean 21 1.1440 ( 0.00%) 1.1769 * -2.87%* Amean 30 1.6629 ( 0.00%) 1.7031 * -2.42%* Amean 48 2.7321 ( 0.00%) 2.7897 * -2.11%* Amean 64 3.7397 ( 0.00%) 3.7640 * -0.65%* It seems there is a bit performance penalty when using ">=" to break up the loop. Hence, we should still use ">" here. --- mm/slub.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index 590cc953895d..6beff3b1e22c 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2619,13 +2619,10 @@ static struct slab *get_partial_node(struct kmem_cache *s, stat(s, CPU_PARTIAL_NODE); partial_slabs++; } -#ifdef CONFIG_SLUB_CPU_PARTIAL - if (partial_slabs > s->cpu_partial_slabs / 2) - break; -#else - break; -#endif + if ((slub_get_cpu_partial(s) == 0) || + (partial_slabs > slub_get_cpu_partial(s) / 2)) + break; } spin_unlock_irqrestore(&n->list_lock, flags); return partial; -- 2.34.1