On Tue, Jan 17, 2023, Mathias Krause wrote: > There is no need to unload the MMU roots when only CR0.WP has changed -- > the paging structures are still valid, only the permission bitmap needs > to be updated. This doesn't hold true when KVM is using shadow paging, in which case CR0.WP affects the shadow page tables. I believe that also holds true for nNPT :-( nEPT doesn't consume CR0.WP so we could expedite that case as well, though identifying that case might be annoying. > Change kvm_mmu_reset_context() to get passed the need for unloading MMU > roots and explicitly avoid it if only CR0.WP was toggled on a CR0 write > caused VMEXIT. One thing we should explore on top of this is not intercepting CR0.WP (on Intel) when TDP is enabled. It could even trigger after toggling CR0.WP N times, e.g. to optimize the grsecurity use case without negatively impacting workloads with a static CR0.WP, as walking guest memory would require an "extra" VMREAD to get CR0.WP in that case. Unfortunately, AMD doesn't provide per-bit controls. > This change brings a huge performance gain as the following micro- > benchmark running 'ssdd 10 50000' from rt-tests[1] on a grsecurity L1 VM > shows (runtime in seconds, lower is better): > > legacy MMU TDP MMU > kvm.git/queue 11.55s 13.91s > kvm.git/queue+patch 7.44s 7.94s > > For legacy MMU this is ~35% faster, for TTP MMU ~43% faster. > > [1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git > > Signed-off-by: Mathias Krause <minipli@xxxxxxxxxxxxxx> > --- ... > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 508074e47bc0..d7c326ab94de 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -902,7 +902,9 @@ EXPORT_SYMBOL_GPL(load_pdptrs); > > void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned long cr0) > { > - if ((cr0 ^ old_cr0) & X86_CR0_PG) { > + unsigned long cr0_change = cr0 ^ old_cr0; > + > + if (cr0_change & X86_CR0_PG) { > kvm_clear_async_pf_completion_queue(vcpu); > kvm_async_pf_hash_reset(vcpu); > > @@ -914,10 +916,18 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned lon > kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu); > } > > - if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS) > - kvm_mmu_reset_context(vcpu); > + if (cr0_change & KVM_MMU_CR0_ROLE_BITS) { > + bool unload_mmu = > + cr0_change & (KVM_MMU_CR0_ROLE_BITS & ~X86_CR0_WP); As above, this needs to guarded with a check that the MMU is direct. And rather than add a flag to kvm_mmu_reset_context(), just call kvm_init_mmu() directly. E.g. I think this would work? diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index d07563d0e204..8f9fac6d81d2 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -927,6 +927,11 @@ EXPORT_SYMBOL_GPL(load_pdptrs); void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned long cr0) { + if (vcpu->arch.mmu->root_role.direct && (cr0 ^ old_cr0) == X86_CR0_WP) { + kvm_init_mmu(vcpu); + return; + } + if ((cr0 ^ old_cr0) & X86_CR0_PG) { kvm_clear_async_pf_completion_queue(vcpu); kvm_async_pf_hash_reset(vcpu);