This patchset adds a fast path in KVM to test and clear access bits on
sptes without taking the mmu_lock. It also adds support for using a
bitmap to (1) test the access bits for many sptes in a single call to
mmu_notifier_test_young, and to (2) clear the access bits for many ptes
in a single call to mmu_notifier_clear_young.

With Yu's permission, I'm now working on getting this series into a
mergeable state.

I'm posting this as an RFC because I'm not sure if the arm64 bits are
correct, and I haven't done complete performance testing. I want to do
broader experimentation to see how much this improves VM performance in
a cloud environment, but I want to be sure that the code is mergeable
first.

Yu has posted other performance results[1], [2]. This v3 shouldn't
significantly change the x86 results, but the arm64 results may have
changed.

The most important changes since v2[3]:

- Split the test_clear_young MMU notifier back into test_young and
  clear_young. I did this because the bitmap passed in has a distinct
  meaning for each of them, and I felt that this was cleaner.

- The return value of test_young / clear_young now indicates if the
  bitmap was used.

- Removed the custom spte walker to implement the lockless path. This
  was important for arm64 to be functionally correct (thanks Oliver),
  and it avoids a lot of problems brought up in review of v2 (for
  example[4]).

- Added kvm_arch_prepare_bitmap_age and kvm_arch_finish_bitmap_age so
  that arm64's bitmap-based aging can take the MMU lock for reading
  while x86 remains lockless.

- The powerpc changes have been dropped.

- The logic to inform architectures how to use the bitmap has been
  cleaned up (kvm_should_clear_young has been split into
  kvm_gfn_should_age and kvm_gfn_record_young) (thanks Nicolas).

There were some smaller changes too:
- Added test_clear_young_metadata (thanks Sean).
- MMU_NOTIFIER_RANGE_LOCKLESS has been renamed to
  MMU_NOTIFIER_YOUNG_FAST, to indicate to the caller that passing a
  bitmap for MGLRU look-around is likely to be beneficial.
- Cleaned up comments that describe the changes to
  mmu_notifier_test_young / mmu_notifier_clear_young (thanks Nicolas).

[1]: https://lore.kernel.org/all/20230609005943.43041-1-yuzhao@xxxxxxxxxx/
[2]: https://lore.kernel.org/all/20230609005935.42390-1-yuzhao@xxxxxxxxxx/
[3]: https://lore.kernel.org/kvmarm/20230526234435.662652-1-yuzhao@xxxxxxxxxx/
[4]: https://lore.kernel.org/all/ZItX64Bbx5vdjo9M@xxxxxxxxxx/

James Houghton (5):
  mm: Add a bitmap into mmu_notifier_{clear,test}_young
  KVM: Move MMU notifier function declarations
  KVM: Add basic bitmap support into kvm_mmu_notifier_test/clear_young
  KVM: x86: Participate in bitmap-based PTE aging
  KVM: arm64: Participate in bitmap-based PTE aging

Yu Zhao (2):
  KVM: x86: Move tdp_mmu_enabled and shadow_accessed_mask
  mm: multi-gen LRU: use mmu_notifier_test_clear_young()

 Documentation/admin-guide/mm/multigen_lru.rst |   6 +-
 arch/arm64/include/asm/kvm_host.h             |   5 +
 arch/arm64/include/asm/kvm_pgtable.h          |   4 +-
 arch/arm64/kvm/hyp/pgtable.c                  |  21 +-
 arch/arm64/kvm/mmu.c                          |  23 ++-
 arch/x86/include/asm/kvm_host.h               |  20 ++
 arch/x86/kvm/mmu.h                            |   6 -
 arch/x86/kvm/mmu/mmu.c                        |  16 +-
 arch/x86/kvm/mmu/spte.h                       |   1 -
 arch/x86/kvm/mmu/tdp_mmu.c                    |  10 +-
 include/linux/kvm_host.h                      | 101 ++++++++--
 include/linux/mmu_notifier.h                  |  93 ++++++++-
 include/linux/mmzone.h                        |   6 +-
 include/trace/events/kvm.h                    |  13 +-
 mm/mmu_notifier.c                             |  20 +-
 mm/rmap.c                                     |   9 +-
 mm/vmscan.c                                   | 183 ++++++++++++++----
 virt/kvm/kvm_main.c                           | 100 +++++++---
 18 files changed, 509 insertions(+), 128 deletions(-)

base-commit: 0cef2c0a2a356137b170c3cb46cb9c1dd2ca3e6b
-- 
2.44.0.478.gd926399ef9-goog
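
For readers who want a feel for the bitmap semantics described in the
changelog, here is a rough userspace model. It is not the code in this
series: model_test_young, model_clear_young, MODEL_ACCESSED and the flat
array of fake PTEs are all hypothetical names made up for illustration.
It assumes that test_young treats the bitmap as an output (one bit set
per young page) and that clear_young treats it as an input (one bit set
per page to age), and that both return whether the bitmap was used.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessed bit in a fake PTE; not a real KVM definition. */
#define MODEL_ACCESSED (1u << 0)

/*
 * Model of bitmap-based test_young: for each of the first n fake PTEs
 * (n <= 64), set bit i in *bitmap if PTE i has its accessed bit set.
 * Returns true if the bitmap was used, false if no bitmap was usable.
 */
static bool model_test_young(const uint32_t *ptes, unsigned int n,
			     uint64_t *bitmap)
{
	unsigned int i;

	if (!bitmap || n > 64)
		return false;

	for (i = 0; i < n; i++)
		if (ptes[i] & MODEL_ACCESSED)
			*bitmap |= 1ull << i;
	return true;
}

/*
 * Model of bitmap-based clear_young: clear the accessed bit only on the
 * fake PTEs whose bit is set in bitmap. Returns true if the bitmap was
 * used.
 */
static bool model_clear_young(uint32_t *ptes, unsigned int n,
			      uint64_t bitmap)
{
	unsigned int i;

	if (n > 64)
		return false;

	for (i = 0; i < n; i++)
		if (bitmap & (1ull << i))
			ptes[i] &= ~MODEL_ACCESSED;
	return true;
}
```

In this model a caller would first collect the young pages with
model_test_young, decide which of them to age, and pass that subset to
model_clear_young, so a single pair of calls covers the whole range
instead of one call per page.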