> On 4/4/24 7:58 AM, xiongwei.song@xxxxxxxxxxxxx wrote:
> > From: Xiongwei Song <xiongwei.song@xxxxxxxxxxxxx>
> >
> > The break conditions for filling cpu partial can be made simpler and
> > more readable.
> >
> > If slub_get_cpu_partial() returns 0, we know we don't need to fill
> > the cpu partial list, so we should break from the loop. On the other
> > hand, we should also break from the loop once we have added enough
> > cpu partial slabs.
> >
> > Meanwhile, the logic above gets rid of the #ifdef and also fixes a
> > weird corner case: if we set cpu_partial_slabs to 0 from sysfs, we
> > still allocate at least one slab here.
> >
> > Signed-off-by: Xiongwei Song <xiongwei.song@xxxxxxxxxxxxx>
> > ---
> >
> > The measurement below compares the performance effects of breaking
> > from the cpu partial filling loop with either of the following
> > conditions:
> >
> > Condition 1:
> > When the count of added cpu slabs is greater than cpu_partial_slabs/2:
> > (partial_slabs > slub_get_cpu_partial(s) / 2)
> >
> > Condition 2:
> > When the count of added cpu slabs is greater than or equal to
> > cpu_partial_slabs/2:
> > (partial_slabs >= slub_get_cpu_partial(s) / 2)
> >
> > The choice of break condition affects how many cpu partial slabs are
> > put on the cpu partial list.
> >
> > The test was run on an "Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz"
> > cpu with 16 cores. The OS is Ubuntu 22.04.
> >
> > hackbench-process-pipes
> >                  6.9-rc2(with ">")      6.9.0-rc2(with ">=")
> > Amean     1      0.0373 (  0.00%)       0.0356 *   4.60%*
> > Amean     4      0.0984 (  0.00%)       0.1014 *  -3.05%*
> > Amean     7      0.1803 (  0.00%)       0.1851 *  -2.69%*
> > Amean    12      0.2947 (  0.00%)       0.3141 *  -6.59%*
> > Amean    21      0.4577 (  0.00%)       0.4927 *  -7.65%*
> > Amean    30      0.6326 (  0.00%)       0.6649 *  -5.10%*
> > Amean    48      0.9396 (  0.00%)       0.9884 *  -5.20%*
> > Amean    64      1.2321 (  0.00%)       1.3004 *  -5.54%*
> >
> > hackbench-process-sockets
> >                  6.9-rc2(with ">")      6.9.0-rc2(with ">=")
> > Amean     1      0.0609 (  0.00%)       0.0623 *  -2.35%*
> > Amean     4      0.2107 (  0.00%)       0.2140 *  -1.56%*
> > Amean     7      0.3754 (  0.00%)       0.3966 *  -5.63%*
> > Amean    12      0.6456 (  0.00%)       0.6734 *  -4.32%*
> > Amean    21      1.1440 (  0.00%)       1.1769 *  -2.87%*
> > Amean    30      1.6629 (  0.00%)       1.7031 *  -2.42%*
> > Amean    48      2.7321 (  0.00%)       2.7897 *  -2.11%*
> > Amean    64      3.7397 (  0.00%)       3.7640 *  -0.65%*
> >
> > It seems there is a slight performance penalty when using ">=" to
> > break the loop. Hence, we should still use ">" here.
>
> Thanks for evaluating that, I suspected that would be the case, so we
> should not change that performance aspect as part of a cleanup.
>
> > ---
> >  mm/slub.c | 9 +++------
> >  1 file changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 590cc953895d..6beff3b1e22c 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2619,13 +2619,10 @@ static struct slab *get_partial_node(struct kmem_cache *s,
> >  			stat(s, CPU_PARTIAL_NODE);
> >  			partial_slabs++;
> >  		}
> > -#ifdef CONFIG_SLUB_CPU_PARTIAL
> > -		if (partial_slabs > s->cpu_partial_slabs / 2)
> > -			break;
> > -#else
> > -		break;
> > -#endif
> >
> > +		if ((slub_get_cpu_partial(s) == 0) ||
> > +		    (partial_slabs > slub_get_cpu_partial(s) / 2))
> > +			break;
> >  	}
> >  	spin_unlock_irqrestore(&n->list_lock, flags);
> >  	return partial;
>
> After looking at the result and your v1 again, I arrived at this
> modification that incorporates the core v1 idea without reintroducing
> kmem_cache_has_cpu_partial(). The modified patch looks like below. Is
> it OK with you?
>
> Pushed the whole series with this modification to slab/for-next for
> now.
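For context, the break condition in the patch above (and in the
modified version below) relies on slub_get_cpu_partial(), introduced
earlier in this series. A sketch of its assumed shape (not the
verbatim mm/slub.c definition) shows why the
"slub_get_cpu_partial(s) == 0" check subsumes the old #ifdef: with
CONFIG_SLUB_CPU_PARTIAL disabled the helper always returns 0, so the
loop stops after taking a single slab.

#ifdef CONFIG_SLUB_CPU_PARTIAL
/* Assumed shape: return the sysfs-tunable cpu partial list limit. */
static inline unsigned int slub_get_cpu_partial(struct kmem_cache *s)
{
        return s->cpu_partial_slabs;
}
#else
/* Assumed shape: no cpu partial list exists in this configuration. */
static inline unsigned int slub_get_cpu_partial(struct kmem_cache *s)
{
        return 0;
}
#endif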
Sorry for the late response, I was on vacation. I'm ok with the patch
below.

Thanks,
Xiongwei

> ----8<-----
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2614,18 +2614,17 @@ static struct slab *get_partial_node(struct kmem_cache *s,
>  		if (!partial) {
>  			partial = slab;
>  			stat(s, ALLOC_FROM_PARTIAL);
> +			if ((slub_get_cpu_partial(s) == 0)) {
> +				break;
> +			}
>  		} else {
>  			put_cpu_partial(s, slab, 0);
>  			stat(s, CPU_PARTIAL_NODE);
> -			partial_slabs++;
> -		}
> -#ifdef CONFIG_SLUB_CPU_PARTIAL
> -		if (partial_slabs > s->cpu_partial_slabs / 2)
> -			break;
> -#else
> -		break;
> -#endif
>
> +			if (++partial_slabs > slub_get_cpu_partial(s) / 2) {
> +				break;
> +			}
> +		}
>  	}
>  	spin_unlock_irqrestore(&n->list_lock, flags);
>  	return partial;
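To make the resulting behavior concrete, here is a small userspace
model of the patched loop's break logic (a hypothetical sketch;
slabs_taken() and its parameters are illustrative only, not kernel
code). It demonstrates the fixed corner case: with cpu_partial_slabs
set to 0 exactly one slab is taken from the node partial list, while a
nonzero value yields one slab returned directly plus
cpu_partial_slabs/2 + 1 slabs moved to the cpu partial list.

#include <stdio.h>

/* Model of the break logic in the patched get_partial_node(); the
 * counting mirrors the diff above under the stated assumptions. */
static unsigned int slabs_taken(unsigned int cpu_partial_slabs,
                                unsigned int available)
{
        unsigned int taken = 0, partial_slabs = 0;
        int have_primary = 0;

        for (unsigned int i = 0; i < available; i++) {
                taken++;                        /* slab removed from the node list */
                if (!have_primary) {
                        have_primary = 1;       /* first slab is returned to the caller */
                        if (cpu_partial_slabs == 0)
                                break;          /* no cpu partial list configured */
                } else {
                        partial_slabs++;        /* slab goes to the cpu partial list */
                        if (partial_slabs > cpu_partial_slabs / 2)
                                break;
                }
        }
        return taken;
}

int main(void)
{
        printf("cpu_partial_slabs=0 -> %u slab(s) taken\n", slabs_taken(0, 8));
        printf("cpu_partial_slabs=6 -> %u slab(s) taken\n", slabs_taken(6, 8));
        return 0;
}

Under these assumptions the program prints 1 and 5 taken slabs
respectively, matching the "greater than cpu_partial_slabs/2" break
semantics that the measurement above favors.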