On Fri, 02 Apr 2021 13:17:45 +0100,
Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 02/04/21 02:56, Sean Christopherson wrote:
> > The end goal of this series is to optimize the MMU notifiers to take
> > mmu_lock if and only if the notification is relevant to KVM, i.e. the hva
> > range overlaps a memslot. Large VMs (hundreds of vCPUs) are very
> > sensitive to mmu_lock being taken for write at inopportune times, and
> > such VMs also tend to be "static", e.g. backed by HugeTLB with minimal
> > page shenanigans. The vast majority of notifications for these VMs will
> > be spurious (for KVM), and eliding mmu_lock for spurious notifications
> > avoids an otherwise unacceptable disruption to the guest.
> >
> > To get there without potentially degrading performance, e.g. due to
> > multiple memslot lookups, especially on non-x86 where the use cases are
> > largely unknown (from my perspective), first consolidate the MMU notifier
> > logic by moving the hva->gfn lookups into common KVM.
> >
> > Based on kvm/queue, commit 5f986f748438 ("KVM: x86: dump_vmcs should
> > include the autoload/autostore MSR lists").
> >
> > Well tested on Intel and AMD. Compile tested for arm64, MIPS, PPC,
> > PPC e500, and s390. Absolutely needs to be tested for real on non-x86,
> > I give it even odds that I introduced an off-by-one bug somewhere.
> >
> > v2:
> > - Drop the patches that have already been pushed to kvm/queue.
> > - Drop two selftest changes that had snuck in via "git commit -a".
> > - Add a patch to assert that mmu_notifier_count is elevated when
> >   .change_pte() runs. [Paolo]
> > - Split out moving KVM_MMU_(UN)LOCK() to __kvm_handle_hva_range() to a
> >   separate patch. Opted not to squash it with the introduction of the
> >   common hva walkers (patch 02), as that prevented sharing code between
> >   the old and new APIs. [Paolo]
> > - Tweak the comment in kvm_vm_destroy() above the smashing of the new
> >   slots lock. [Paolo]
> > - Make mmu_notifier_slots_lock unconditional to avoid #ifdefs. [Paolo]
> >
> > v1:
> > - https://lkml.kernel.org/r/20210326021957.1424875-1-seanjc@xxxxxxxxxx
> >
> > Sean Christopherson (10):
> >   KVM: Assert that notifier count is elevated in .change_pte()
> >   KVM: Move x86's MMU notifier memslot walkers to generic code
> >   KVM: arm64: Convert to the gfn-based MMU notifier callbacks
> >   KVM: MIPS/MMU: Convert to the gfn-based MMU notifier callbacks
> >   KVM: PPC: Convert to the gfn-based MMU notifier callbacks
> >   KVM: Kill off the old hva-based MMU notifier callbacks
> >   KVM: Move MMU notifier's mmu_lock acquisition into common helper
> >   KVM: Take mmu_lock when handling MMU notifier iff the hva hits a
> >     memslot
> >   KVM: Don't take mmu_lock for range invalidation unless necessary
> >   KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if
> >     possible
> >
> >  arch/arm64/kvm/mmu.c                   | 117 +++------
> >  arch/mips/kvm/mmu.c                    |  97 ++------
> >  arch/powerpc/include/asm/kvm_book3s.h  |  12 +-
> >  arch/powerpc/include/asm/kvm_ppc.h     |   9 +-
> >  arch/powerpc/kvm/book3s.c              |  18 +-
> >  arch/powerpc/kvm/book3s.h              |  10 +-
> >  arch/powerpc/kvm/book3s_64_mmu_hv.c    |  98 ++------
> >  arch/powerpc/kvm/book3s_64_mmu_radix.c |  25 +-
> >  arch/powerpc/kvm/book3s_hv.c           |  12 +-
> >  arch/powerpc/kvm/book3s_pr.c           |  56 ++---
> >  arch/powerpc/kvm/e500_mmu_host.c       |  27 +-
> >  arch/x86/kvm/mmu/mmu.c                 | 127 ++++------
> >  arch/x86/kvm/mmu/tdp_mmu.c             | 245 +++++++------------
> >  arch/x86/kvm/mmu/tdp_mmu.h             |  14 +-
> >  include/linux/kvm_host.h               |  22 +-
> >  virt/kvm/kvm_main.c                    | 325 +++++++++++++++++++------
> >  16 files changed, 552 insertions(+), 662 deletions(-)
> >
>
> For MIPS, I am going to post a series that simplifies TLB flushing
> further. I applied it, and rebased this one on top, to
> kvm/mmu-notifier-queue.
>
> Architecture maintainers, please look at the branch and
> review/test/ack your parts.

I've given this a reasonably good beating on arm64 for both VHE and
nVHE HW, and nothing caught fire, although I was left with a conflict
in the x86 code after merging with linux/master.

Feel free to add a

Tested-by: Marc Zyngier <maz@xxxxxxxxxx>

for the arm64 side.

	M.

-- 
Without deviation from the norm, progress is not possible.