The TDP MMU has a performance regression compared to the legacy MMU
when CR0 changes often. This was reported for the grsecurity kernel,
which uses CR0.WP to implement kernel W^X. In that case, each change to
CR0.WP unloads the MMU and causes a lot of unnecessary work. When
running nested, this can even cause the L1 to hardly make progress, as
the L0 hypervisor is overwhelmed by the amount of MMU work that is
needed.

The root reason why kvm_mmu_reset_context calls kvm_mmu_unload is a
subtlety of the implementation of fast PGD switching, which requires a
call to kvm_mmu_new_pgd (and therefore knowing the new MMU role)
*before* kvm_init_mmu. kvm_mmu_reset_context chickens out and does not
do fast PGD switching at all, instead dropping all the roots.

Therefore, the most important part of this series is a reorganization
of fast PGD switching; it makes it possible to call kvm_mmu_new_pgd
*after* the MMU has been set up, just using the MMU role instead of
kvm_mmu_calc_root_page_role.

Patches 1 and 2 are bugfixes found while working on the series.

Patches 3 and 4 add more sanity checks that triggered a lot during
development.

Patches 5 to 7 are related cleanups. In particular, patch 5 makes the
cache lookup code a bit more pleasant.

Patches 8 and 9 rework the fast PGD switching. Patches 10 and 11 are
cleanups enabled by the rework, and the only survivors of the CPU role
patchset.

Patches 13 to 16 tidy up callers of kvm_mmu_reset_context and
kvm_mmu_new_pgd. kvm_mmu_new_pgd is changed to use the ->get_guest_pgd
callback, avoiding the possibility of confusion between the root_mmu
and the guest_mmu, and a new request is created for it (this will also
be put to use once the role patchset allows automatic detection of a
changed MMU role).

Finally, patch 17 changes callers that expect kvm_mmu_reset_context to
perform a guest TLB flush, and patch 18 optimizes
kvm_mmu_reset_context.
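
To illustrate why the cached-root lookup depends on the MMU role, here
is a minimal user-space sketch in C. It is not the KVM code: struct
root_info, struct mmu, root_matches and fast_pgd_switch are hypothetical,
simplified stand-ins, and the role is reduced to a plain integer. The
only point is that a root can be reused when both the PGD and the role
it was built with match, so the lookup can run after the role has been
recomputed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real KVM structures. */
#define NUM_PREV_ROOTS 3
#define INVALID_ROOT   UINT64_MAX

struct root_info {
	uint64_t pgd;   /* guest page table base this root was built for */
	uint32_t role;  /* MMU role this root was built with */
	uint64_t hpa;   /* host address of the root, INVALID_ROOT if unused */
};

struct mmu {
	uint32_t role;                          /* role after (re)initialization */
	struct root_info root;                  /* current root */
	struct root_info prev[NUM_PREV_ROOTS];  /* small cache of older roots */
};

static bool root_matches(const struct root_info *r, uint64_t pgd, uint32_t role)
{
	return r->hpa != INVALID_ROOT && r->pgd == pgd && r->role == role;
}

/*
 * Try to reuse a root for new_pgd under the MMU's current role.
 * Returns true on a hit. On a miss, the current root is moved into the
 * cache and marked invalid, so the caller has to build a new root.
 */
static bool fast_pgd_switch(struct mmu *mmu, uint64_t new_pgd)
{
	struct root_info tmp;
	size_t i;

	if (root_matches(&mmu->root, new_pgd, mmu->role))
		return true;

	for (i = 0; i < NUM_PREV_ROOTS; i++) {
		if (root_matches(&mmu->prev[i], new_pgd, mmu->role)) {
			/* Hit: swap the cached root with the current one. */
			tmp = mmu->root;
			mmu->root = mmu->prev[i];
			mmu->prev[i] = tmp;
			return true;
		}
	}

	/* Miss: shift the cache, dropping the oldest entry. */
	for (i = NUM_PREV_ROOTS - 1; i > 0; i--)
		mmu->prev[i] = mmu->prev[i - 1];
	mmu->prev[0] = mmu->root;
	mmu->root.hpa = INVALID_ROOT;
	return false;
}
```

In this toy model, changing the role and then changing it back finds
the earlier root in the cache instead of rebuilding it, which is the
kind of reuse that dropping all the roots rules out.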

Paolo

Lai Jiangshan (1):
  KVM: x86/mmu: Do not use guest root level in audit

Paolo Bonzini (17):
  KVM: x86: host-initiated EFER.LME write affects the MMU
  KVM: x86: do not deliver asynchronous page faults if CR0.PG=0
  KVM: x86/mmu: WARN if PAE roots linger after kvm_mmu_unload
  KVM: x86/mmu: avoid NULL-pointer dereference on page freeing bugs
  KVM: x86/mmu: use struct kvm_mmu_root_info for mmu->root
  KVM: x86/mmu: do not consult levels when freeing roots
  KVM: x86/mmu: do not pass vcpu to root freeing functions
  KVM: x86/mmu: look for a cached PGD when going from 32-bit to 64-bit
  KVM: x86/mmu: load new PGD after the shadow MMU is initialized
  KVM: x86/mmu: Always use current mmu's role when loading new PGD
  KVM: x86/mmu: clear MMIO cache when unloading the MMU
  KVM: x86: reset and reinitialize the MMU in __set_sregs_common
  KVM: x86/mmu: avoid indirect call for get_cr3
  KVM: x86/mmu: rename kvm_mmu_new_pgd, introduce variant that calls get_guest_pgd
  KVM: x86: introduce KVM_REQ_MMU_UPDATE_ROOT
  KVM: x86: flush TLB separately from MMU reset
  KVM: x86: do not unload MMU roots on all role changes

 arch/x86/include/asm/kvm_host.h |  10 +-
 arch/x86/kvm/mmu.h              |  18 ++-
 arch/x86/kvm/mmu/mmu.c          | 273 +++++++++++++++++---------------
 arch/x86/kvm/mmu/mmu_audit.c    |  16 +-
 arch/x86/kvm/mmu/paging_tmpl.h  |   4 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |   2 +-
 arch/x86/kvm/mmu/tdp_mmu.h      |   2 +-
 arch/x86/kvm/svm/nested.c       |   6 +-
 arch/x86/kvm/vmx/nested.c       |  16 +-
 arch/x86/kvm/vmx/vmx.c          |   2 +-
 arch/x86/kvm/x86.c              | 135 ++++++++++------
 11 files changed, 279 insertions(+), 205 deletions(-)

--
2.31.1