> On 3/31/24 4:19 AM, xiongwei.song@xxxxxxxxxxxxx wrote:
> > From: Xiongwei Song <xiongwei.song@xxxxxxxxxxxxx>
> >
> > The break conditions can be more readable and simple.
> >
> > We can check if we need to fill cpu partial after getting the first
> > partial slab. If kmem_cache_has_cpu_partial() returns true, we fill
> > cpu partial from next iteration, or break up the loop.
> >
> > Then we can remove the preprocessor condition of
> > CONFIG_SLUB_CPU_PARTIAL. Use dummy slub_get_cpu_partial() to make
> > compiler silent.
> >
> > Signed-off-by: Xiongwei Song <xiongwei.song@xxxxxxxxxxxxx>
> > ---
> >  mm/slub.c | 22 ++++++++++++----------
> >  1 file changed, 12 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 590cc953895d..ec91c7435d4e 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2614,18 +2614,20 @@ static struct slab *get_partial_node(struct kmem_cache *s,
> >  		if (!partial) {
> >  			partial = slab;
> >  			stat(s, ALLOC_FROM_PARTIAL);
> > -		} else {
> > -			put_cpu_partial(s, slab, 0);
> > -			stat(s, CPU_PARTIAL_NODE);
> > -			partial_slabs++;
> > +
> > +			/* Fill cpu partial if needed from next iteration, or break */
> > +			if (kmem_cache_has_cpu_partial(s))
>
> That kinda puts back the check removed in patch 1, although only in the
> first iteration. Still not ideal.
>
> > +				continue;
> > +			else
> > +				break;
> >  		}
> > -#ifdef CONFIG_SLUB_CPU_PARTIAL
> > -		if (partial_slabs > s->cpu_partial_slabs / 2)
> > -			break;
> > -#else
> > -		break;
> > -#endif
>
> I'd suggest instead of the changes done in this patch, only change this part
> above to:
>
> 	if ((slub_get_cpu_partial(s) == 0) ||
> 	    (partial_slabs > slub_get_cpu_partial(s) / 2))
> 		break;
>
> That gets rid of the #ifdef and also fixes a weird corner case that if we
> set cpu_partial_slabs to 0 from sysfs, we still allocate at least one here.

Oh, yes. Will update.

>
> It could be tempting to use >= instead of > to achieve the same effect but
> that would have unintended performance effects that would best be evaluated
> separately.

I can run a test to measure Amean changes. But in terms of x86 assembly,
">=" should not add any extra instructions. I did a simple test: the
compiler emits a "jle" instruction for ">=", while "jl" is used for ">".
No additional instructions are involved, so there should be no performance
effect on x86.

Thanks,
Xiongwei

>
> >
> > +		put_cpu_partial(s, slab, 0);
> > +		stat(s, CPU_PARTIAL_NODE);
> > +		partial_slabs++;
> > +
> > +		if (partial_slabs > slub_get_cpu_partial(s) / 2)
> > +			break;
> >  	}
> >  	spin_unlock_irqrestore(&n->list_lock, flags);
> >  	return partial;
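
For context, a minimal sketch of how the slub_get_cpu_partial() helper
referenced above could be defined, assuming the dummy variant simply
reports 0 when CONFIG_SLUB_CPU_PARTIAL is disabled. This is only an
illustration, not the actual contents of the earlier patch in the series:

	#ifdef CONFIG_SLUB_CPU_PARTIAL
	/* Report the configured per-cpu partial slab limit. */
	static inline unsigned int slub_get_cpu_partial(struct kmem_cache *s)
	{
		return s->cpu_partial_slabs;
	}
	#else
	/* Dummy helper: keeps callers ifdef-free; always reports 0. */
	static inline unsigned int slub_get_cpu_partial(struct kmem_cache *s)
	{
		return 0;
	}
	#endif

With a helper along these lines, the suggested break condition above exits
immediately both when CONFIG_SLUB_CPU_PARTIAL is disabled and when
cpu_partial_slabs has been set to 0 via sysfs, without needing the #ifdef.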