On Fri, Apr 26, 2024 at 8:28 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Fri, Apr 26, 2024 at 08:07:45AM -0700, Suren Baghdasaryan wrote:
> > On Fri, Apr 26, 2024 at 7:00 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > Intel's 0day got back to me with data and it's ridiculously good.
> > > Headline figure: over 3x throughput improvement with vm-scalability
> > > https://lore.kernel.org/all/202404261055.c5e24608-oliver.sang@xxxxxxxxx/
> > >
> > > I can't see why it's that good. It shouldn't be that good. I'm
> > > seeing big numbers here:
> > >
> > >       4366 ± 2%    +565.6%      29061    perf-stat.overall.cycles-between-cache-misses
> > >
> > > and the code being deleted is only checking vma->vm_ops and
> > > vma->anon_vma. Surely that cache line is referenced so frequently
> > > during pagefault that deleting a reference here will make no difference
> > > at all?
> >
> > That indeed looks overly good. Sorry, I didn't have a chance to run
> > the benchmarks on my side yet because of the ongoing Android bootcamp
> > this week.
>
> No problem. Darn work getting in the way of having fun ;-)
>
> > > I still don't understand why we have to take the mmap_sem less often.
> > > Is there perhaps a VMA for which we have a NULL vm_ops, but don't set
> > > an anon_vma on a page fault?
> >
> > I think the only path in either do_anonymous_page() or
> > do_huge_pmd_anonymous_page() that skips calling anon_vma_prepare() is
> > the "Use the zero-page for reads" here:
> > https://elixir.bootlin.com/linux/latest/source/mm/memory.c#L4265. I
> > didn't look into this particular benchmark yet but will try it out
> > once I have some time to benchmark your change.
>
> Yes, Liam and I had just brainstormed that as being a plausible
> explanation too. I don't know how frequent it is to use anon memory
> read-only. Presumably it must happen often enough that we've bothered
> to implement the zero-page optimisation. But probably not nearly as
> often as this benchmark makes it happen ;-)

I also wonder if some of this improvement can be attributed to the
last patch in your series
(https://lore.kernel.org/all/20240426144506.1290619-5-willy@xxxxxxxxxxxxx/).
I assume it was included in the 0day testing?
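
For reference, the read-only anonymous access pattern discussed above can be reproduced from userspace with a minimal sketch like the one below. It is an illustration only and is not taken from vm-scalability or from the patch series; the helper name read_only_anon_sum() is hypothetical. It assumes a Linux system where a read fault on untouched private anonymous memory is normally served from the shared zero page, so every byte read is expected to be zero and no page is ever written.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map npages of private
 * anonymous memory and only ever read from it, one byte per page.
 * Returns the sum of the bytes read, or -1 if the mapping fails.
 */
long read_only_anon_sum(size_t npages)
{
	long page_size = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	size_t len, i;
	long sum = 0;
	void *map;

	assert(npages > 0);
	if (page_size <= 0)
		return -1;
	len = npages * (size_t)page_size;

	/* A private anonymous mapping has no vm_ops behind it. */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	/* Read faults only; the volatile access keeps the loads in place. */
	p = map;
	for (i = 0; i < npages; i++)
		sum += p[i * (size_t)page_size];

	munmap(map, len);
	return sum;
}
```

Calling read_only_anon_sum(16) should return 0, since untouched anonymous memory reads back as zeros. Whether each of those faults actually takes the zero-page path depends on the kernel configuration (for example THP settings), so treat this as a sketch of the workload shape rather than a benchmark.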