Re: [PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage

Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> · Mon, 31 Jul 2023 18:49:03 +0900

On Mon, Jul 24, 2023 at 11:40 AM Oliver Sang <oliver.sang@xxxxxxxxx> wrote:
>
> hi, Hyeonggon Yoo,
>
> On Thu, Jul 20, 2023 at 11:15:04PM +0900, Hyeonggon Yoo wrote:
> > On Thu, Jul 20, 2023 at 10:46 PM Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> wrote:
> > >
> > > On Thu, Jul 20, 2023 at 9:59 PM Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> wrote:
> > > > On Thu, Jul 20, 2023 at 12:01 PM Oliver Sang <oliver.sang@xxxxxxxxx> wrote:
> > > > > > > commit:
> > > > > > >   7bc162d5cc ("Merge branches 'slab/for-6.5/prandom', 'slab/for-6.5/slab_no_merge' and 'slab/for-6.5/slab-deprecate' into slab/for-next")
> > > > > > >   a0fd217e6d ("mm/slub: Optimize slub memory usage")
> > > > > > >
> > > > > > > 7bc162d5cc4de5c3 a0fd217e6d6fbd23e91f8796787
> > > > > > > ---------------- ---------------------------
> > > > > > >          %stddev     %change         %stddev
> > > > > > >              \          |                \
> > > > > > >     222503 ą 86%    +108.7%     464342 ą 58%  numa-meminfo.node1.Active
> > > > > > >     222459 ą 86%    +108.7%     464294 ą 58%  numa-meminfo.node1.Active(anon)
> > > > > > >      55573 ą 85%    +108.0%     115619 ą 58%  numa-vmstat.node1.nr_active_anon
> > > > > > >      55573 ą 85%    +108.0%     115618 ą 58%  numa-vmstat.node1.nr_zone_active_anon
> > > > > >
> > > > > > I'm quite baffled while reading this.
> > > > > > How did changing slab order calculation double the number of active anon pages?
> > > > > > I doubt two experiments were performed on the same settings.
> > > > >
> > > > > let me introduce our test process.
> > > > >
> > > > > we make sure the tests upon commit and its parent have exact same environment
> > > > > except the kernel difference, and we also make sure the config to build the
> > > > > commit and its parent are identical.
> > > > >
> > > > > we run tests for one commit at least 6 times to make sure the data is stable.
> > > > >
> > > > > such like for this case, we rebuild the commit and its parent's kernel, the
> > > > > config is attached FYI.
> > >
> > > Oh I missed the attachments.
> > > I need more time to look more into that, but could you please test
> > > this patch (attached)?
> >
> > Oh, my mistake. It has nothing to do with reclamation modifiers.
> > The correct patch should be this. Sorry for the noise.
>
> I applied below patch directly upon "mm/slub: Optimize slub memory usage",
> so our tree looks like below:
>
> * 6ba0286048431 (linux-devel/fixup-a0fd217e6d6fbd23e91f8796787b621e7d576088) mm/slub: do not allocate from remote node to allocate high order slab
> * a0fd217e6d6fb (linux-review/Jay-Patel/mm-slub-Optimize-slub-memory-usage/20230628-180050) mm/slub: Optimize slub memory usage
> *---.   7bc162d5cc4de (vbabka-slab/for-linus) Merge branches 'slab/for-6.5/prandom', 'slab/for-6.5/slab_no_merge' and 'slab/for-6.5/slab-deprecate' into slab/for-next
>
> 6ba0286048431 is as below [1]
> since there are some line number differences, no sure if my applying ok? or
> should I pick another base?

It was fine; the patch was applied and tested correctly.

> by this applying, we noticed the regression still exists.
> on 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory

Thank you for testing it!
Unfortunately my guess seems to be wrong in this case,
based on information that Feng Tang gave us.

While I'm still interested in evaluating potential gains in SLUB,
for this case I would like to focus more on the v4, as
Vlastimil pointed out!

Thanks,
Hyeonggon

> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
>   gcc-12/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp2/hackbench
>
> 7bc162d5cc4de5c3 a0fd217e6d6fbd23e91f8796787 6ba02860484315665e300d9f415
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>     479042           -12.5%     419357           -12.0%     421407        hackbench.throughput
>
> detail data is attached as hackbench-6ba0286048431-ICL-Gold-6338
>
>
> on 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
>
> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
>   gcc-12/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp6/hackbench
>
> 7bc162d5cc4de5c3 a0fd217e6d6fbd23e91f8796787 6ba02860484315665e300d9f415
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>     455347            -5.9%     428458            -6.4%     426221        hackbench.throughput
>
> detail data is attached as hackbench-6ba0286048431-ICL-Platinum-8358
>
>
> [1]
> commit 6ba02860484315665e300d9f41511f36940a50f0 (linux-devel/fixup-a0fd217e6d6fbd23e91f8796787b621e7d576088)
> Author: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> Date:   Thu Jul 20 22:29:16 2023 +0900
>
>     mm/slub: do not allocate from remote node to allocate high order slab
>
>     Signed-off-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 8ea7a5ccac0dc..303c57ee0f560 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1981,7 +1981,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>          * Let the initial higher-order allocation fail under memory pressure
>          * so we fall-back to the minimum order allocation.
>          */
> -       alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> +       alloc_gfp = (flags | __GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
>         if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
>                 alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_RECLAIM;
>
>
>
>
>
> > From 74142b5131e731f662740d34623d93fd324f9b65 Mon Sep 17 00:00:00 2001
> > From: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> > Date: Thu, 20 Jul 2023 22:29:16 +0900
> > Subject: [PATCH] mm/slub: do not allocate from remote node to allocate high
> >  order slab
> >
> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> > ---
> >  mm/slub.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index f7940048138c..c584237d6a0d 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2010,7 +2010,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >        * Let the initial higher-order allocation fail under memory pressure
> >        * so we fall-back to the minimum order allocation.
> >        */
> > -     alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> > +     alloc_gfp = (flags | __GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> >       if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
> >               alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_RECLAIM;
> >
> > --
> > 2.41.0
> >
>
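
For readers following the quoted diff, here is a minimal sketch of how the
patched expression in allocate_slab() composes the flags for the first
(higher-order) allocation attempt. It is not kernel code: the SK_* bit
values, the gfp_sketch_t type and the first_attempt_gfp() helper are all
hypothetical stand-ins, and the real __GFP_* constants have different
values. Only the shape of the expression is taken from the diff above.

```c
#include <stdint.h>

/*
 * Illustrative bit values only (assumption): the real __GFP_* constants
 * are defined by the kernel and differ from these.
 */
typedef uint32_t gfp_sketch_t;

#define SK_DIRECT_RECLAIM 0x001u
#define SK_KSWAPD_RECLAIM 0x002u
#define SK_RECLAIM        (SK_DIRECT_RECLAIM | SK_KSWAPD_RECLAIM)
#define SK_NOWARN         0x004u
#define SK_NORETRY        0x008u
#define SK_NOFAIL         0x010u
#define SK_NOMEMALLOC     0x020u
#define SK_THISNODE       0x040u

/*
 * Hypothetical helper mirroring the two statements in the patched hunk:
 * build the mask for the first attempt at the preferred order.
 */
static gfp_sketch_t first_attempt_gfp(gfp_sketch_t flags, int order, int min_order)
{
	/* Patched line: stay on the requested node, fail quietly and fast. */
	gfp_sketch_t alloc_gfp =
		(flags | SK_THISNODE | SK_NOWARN | SK_NORETRY) & ~SK_NOFAIL;

	/*
	 * Unchanged follow-up: when the preferred order is above the minimum
	 * order, drop reclaim so the attempt fails under memory pressure and
	 * the caller can fall back to the minimum order.
	 */
	if ((alloc_gfp & SK_DIRECT_RECLAIM) && order > min_order)
		alloc_gfp = (alloc_gfp | SK_NOMEMALLOC) & ~SK_RECLAIM;

	return alloc_gfp;
}
```

With the patch, the first attempt always carries the node-local bit; for an
order above the minimum it also loses both reclaim bits, which is why that
attempt is expected to fail rather than spill to a remote node.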