Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

Johannes Weiner <hannes@xxxxxxxxxxx> · Thu, 12 Oct 2023 22:33:29 -0400

On Thu, Oct 12, 2023 at 04:28:49PM -0700, Yosry Ahmed wrote:
> [..]
> > > >
> > > > Using next-20231009 and a similar 44 core machine with hyperthreading
> > > > disabled, I ran 22 instances of netperf in parallel and got the
> > > > following numbers from averaging 20 runs:
> > > >
> > > > Base: 33076.5 mbps
> > > > Patched: 31410.1 mbps
> > > >
> > > > That's about 5% diff. I guess the number of iterations helps reduce
> > > > the noise? I am not sure.
> > > >
> > > > Please also keep in mind that in this case all netperf instances are
> > > > in the same cgroup and at a 4-level depth. I imagine in a practical
> > > > setup processes would be a little more spread out, which means less
> > > > common ancestors, so less contended atomic operations.
> > >
> > >
> > > (Resending the reply as I messed up the last one, was not in plain text)
> > >
> > > I was curious, so I ran the same testing in a cgroup 2 levels deep
> > > (i.e /sys/fs/cgroup/a/b), which is a much more common setup in my
> > > experience. Here are the numbers:
> > >
> > > Base: 40198.0 mbps
> > > Patched: 38629.7 mbps
> > >
> > > The regression is reduced to ~3.9%.
> > >
> > > What's more interesting is that going from a level 2 cgroup to a level
> > > 4 cgroup is already a big hit with or without this patch:
> > >
> > > Base: 40198.0 -> 33076.5 mbps (~17.7% regression)
> > > Patched: 38629.7 -> 31410.1 (~18.7% regression)
> > >
> > > So going from level 2 to 4 is already a significant regression for
> > > other reasons (e.g. hierarchical charging). This patch only makes it
> > > marginally worse. This puts the numbers more into perspective imo than
> > > comparing values at level 4. What do you think?
> >
> > I think it's reasonable.
> >
> > Especially comparing to how many cachelines we used to touch on the
> > write side when all flushing happened there. This looks like a good
> > trade-off to me.
> 
> Thanks.
> 
> Still wanting to figure out if this patch is what you suggested in our
> previous discussion [1], to add a
> Suggested-by if appropriate :)
> 
> [1]https://lore.kernel.org/lkml/20230913153758.GB45543@xxxxxxxxxxx/

Haha, sort of. I suggested the cgroup-level flush-batching, but my
proposal was missing the clever upward propagation of the pending stat
updates that you added.

You can add the tag if you're feeling generous, but I wouldn't be mad
if you don't!