On Fri, Aug 16, 2024 at 6:46 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Jul 25, 2024, James Houghton wrote:
> > On Tue, Jul 23, 2024 at 6:11 PM James Houghton <jthoughton@xxxxxxxxxx> wrote:
> > >
> > > Replace the MMU write locks (taken in the memslot iteration loop) for
> > > read locks.
> > >
> > > Grabbing the read lock instead of the write lock is safe because the
> > > only requirement we have is that the stage-2 page tables do not get
> > > deallocated while we are walking them. The stage2_age_walker() callback
> > > is safe to race with itself; update the comment to reflect the
> > > synchronization change.
> > >
> > > Signed-off-by: James Houghton <jthoughton@xxxxxxxxxx>
> > > ---
> >
> > Here is some data to show that this patch at least *can* be helpful:
> >
> > # arm64 patched to do aging (i.e., set HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY)
> > # The test is faulting memory in while doing aging as fast as possible.
> > # taskset -c 0-32 ./access_tracking_perf_test -l -r /dev/cgroup/memory -p -v 32 -m 3
> >
> > # Write lock
> > vcpu wall time                : 3.039207157s
> > lru_gen avg pass duration     : 1.660541541s, (passes:2, total:3.321083083s)
> >
> > # Read lock
> > vcpu wall time                : 3.010848445s
> > lru_gen avg pass duration     : 0.306623698s, (passes:11, total:3.372860688s)
> >
> > Aging is able to run significantly faster, but vCPU runtime isn't
> > affected much (in this test).
>
> Were you expecting vCPU runtime to improve (more)?  If so, lack of movement could
> be due to KVM arm64 taking mmap_lock for read when handling faults:
>
> https://lore.kernel.org/all/Zr0ZbPQHVNzmvwa6@xxxxxxxxxx

For the above test, I don't think it's mmap_lock -- the reclaim path,
e.g., when zswapping guest memory, has two stages: aging (scanning
PTEs) and eviction (unmapping PTEs). Only testing the former isn't
realistic at all. IOW, for a r/w lock use case, only testing the read
lock path would be bad coverage.