On Mon, 24 Aug 2020, Andrew Morton wrote:
> On Mon, 24 Aug 2020 20:54:33 +0800 Alex Shi <alex.shi@xxxxxxxxxxxxxxxxx> wrote:
>
> > The new version which bases on v5.9-rc2.

Well timed and well based, thank you Alex. Particularly helpful to me,
to include those that already went into mmotm: it's a surer foundation
to test on top of the -rc2 base.

> > the first 6 patches was picked into
> > linux-mm, and add patch 25-32 that do some further post optimization.
>
> 32 patches, version 18.  That's quite heroic.  I'm unsure whether I
> should merge it up at this point - what do people think?

I'd love for it to go into mmotm - but not today.

Version 17 tested out well.  I've only just started testing version 18,
but I'm afraid there have been a number of "improvements" in between,
which show up as warnings (lots of VM_WARN_ON_ONCE_PAGE(!memcg) - I
think one or more of those are already in mmotm and under discussion on
the list, but I haven't read through yet, and I may have caught more
cases to examine; a per-cpu warning from munlock_vma_page(); something
else flitted by at reboot time before I could read it).  No crashes so
far, but I haven't got very far with it yet.  I'll report back later in
the week.

Andrew demurred on version 17 for lack of review.  Alexander Duyck has
been doing a lot on that front since then.  I have intended to do so,
but it's a mirage that moves away from me as I move towards it: I have
some time in the coming weeks to get back to that, but it would help me
if the series is held more static by being in mmotm - we may need
fixes, but improvements are liable to get in the way of finalizing.

I still find the reliance on TestClearPageLRU, rather than lru_lock,
hard to wrap my head around: but for so long as it's working correctly,
please take that as a problem with my head (and something we can
certainly change later if necessary, by re-adding the use of lru_lock
in certain places (or by fitting me with a new head)).

> > Following Daniel Jordan's suggestion, I have run 208 'dd' with on 104
> > containers on a 2s * 26cores * HT box with a modefied case:
> > https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
> > With this patchset, the readtwice performance increased about 80%
> > in concurrent containers.
>
> That's rather a slight amount of performance testing for a huge
> performance patchset!

Indeed.  And I see that claim of readtwice performance increasing about
80% goes back eight months to v6: a lot of fundamental bugs have been
fixed in it since then, so I do think it needs refreshing.  It could be
faster now: v16 or v17 fixed the last bug I knew of, which had been
slowing down reclaim considerably.

When I last timed my repetitive swapping loads (not loads anyone
sensible would be running with), across only two memcgs, Alex's
patchset was slightly faster than without: it really did make a
difference.  But I tend to think that for all patchsets, there exists
at least one test that shows it faster, and another that shows it
slower.

> Is more detailed testing planned?

Not by me: performance testing is not something I trust myself with, I
just get lost in the numbers.  Alex, this is what we hoped for months
ago - please make a more convincing case; I hope Daniel and others can
make more suggestions.  But my own evidence suggests it's good.

Hugh
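
For reference, the isolation pattern Hugh mentions above looks roughly
like the sketch below - this is a sketch rather than the actual patch
text, and lock_page_lruvec_irq()/unlock_page_lruvec_irq() are assumed
to be the per-memcg lruvec locking helpers that the series introduces.
The key point is that the atomic TestClearPageLRU() decides who owns
the page: only the caller that wins that test-and-clear may touch the
page's LRU list entry, so the (finer-grained) lruvec lock is taken only
after that race has already been won, instead of a global lru_lock
excluding all racers up front.

	int isolate_lru_page(struct page *page)
	{
		int ret = -EBUSY;

		/*
		 * Only the winner of this atomic test-and-clear of
		 * PG_lru may isolate the page; losers back off, so no
		 * global lru_lock is needed to serialize them.
		 */
		if (TestClearPageLRU(page)) {
			struct lruvec *lruvec;

			get_page(page);
			/*
			 * Per-memcg lruvec lock (added by the series),
			 * taken only once the page is already "ours".
			 */
			lruvec = lock_page_lruvec_irq(page);
			del_page_from_lru_list(page, lruvec, page_lru(page));
			unlock_page_lruvec_irq(lruvec);
			ret = 0;
		}

		return ret;
	}

In other words, correctness rests on that atomic bit operation rather
than on mutual exclusion under lru_lock, which is what makes the
locking so much finer-grained - and, as Hugh says, harder to reason
about.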