Re: [PATCH v18 00/32] per memcg lru_lock

Daniel Jordan <daniel.m.jordan@xxxxxxxxxx> · Mon, 24 Aug 2020 21:56:27 -0400

On Mon, Aug 24, 2020 at 01:24:20PM -0700, Hugh Dickins wrote:
> On Mon, 24 Aug 2020, Andrew Morton wrote:
> > On Mon, 24 Aug 2020 20:54:33 +0800 Alex Shi <alex.shi@xxxxxxxxxxxxxxxxx> wrote:
> Andrew demurred on version 17 for lack of review.  Alexander Duyck has
> been doing a lot on that front since then.  I have intended to do so,
> but it's a mirage that moves away from me as I move towards it: I have

Same, I haven't been able to keep up with the versions or the recent review
feedback.  I got through about half of v17 last week and hope to have more time
for the rest this week and beyond.

> > > Following Daniel Jordan's suggestion, I have run 208 'dd' on 104
> > > containers on a 2s * 26cores * HT box with a modified case:

Alex, do you have a pointer to the modified readtwice case?

Even better would be a description of the problem you're having in production
with lru_lock.  We might be able to create at least a simulation of it to show
what the expected improvement of your real workload is.

> > > https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
> > > With this patchset, the readtwice performance increased about 80%
> > > in concurrent containers.
> > 
> > That's rather a slight amount of performance testing for a huge
> > performance patchset!
> 
> Indeed.  And I see that clause about readtwice performance increased 80%
> going back eight months to v6: a lot of fundamental bugs have been fixed
> in it since then, so I do think it needs refreshing.  It could be faster
> now: v16 or v17 fixed the last bug I knew of, which had been slowing
> down reclaim considerably.
> 
> When I last timed my repetitive swapping loads (not loads anyone sensible
> would be running with), across only two memcgs, Alex's patchset was
> slightly faster than without: it really did make a difference.  But
> I tend to think that for all patchsets, there exists at least one
> test that shows it faster, and another that shows it slower.
> 
> > Is more detailed testing planned?
> 
> Not by me, performance testing is not something I trust myself with,
> just get lost in the numbers: Alex, this is what we hoped for months
> ago, please make a more convincing case, I hope Daniel and others
> can make more suggestions.  But my own evidence suggests it's good.

I ran a few benchmarks on v17 last week (sysbench oltp readonly, kerndevel from
mmtests, a memcg-ized version of the readtwice case I cooked up) and then today
discovered there's a chance I wasn't running the right kernels, so I'm redoing
them on v18.  I plan to look into what other, more "macro" tests would be
sensitive to these changes.