On Thu, 2022-02-17 at 16:03 -0500, Paolo Bonzini wrote:
> kvm_mmu_reset_context is called on all role changes and right now it
> calls kvm_mmu_unload.  With the legacy MMU this is a relatively cheap
> operation; the previous PGDs remain in the hash table and are picked
> up immediately on the next page fault.  With the TDP MMU, however, the
> roots are thrown away for good and a full rebuild of the page tables is
> necessary, which is many times more expensive.
>
> Fortunately, throwing away the roots is not necessary except when
> the manual says a TLB flush is required:

Actually, does a TLB flush throw away the roots?  I think we only sync
them and keep on using them (kvm_vcpu_flush_tlb_guest).  I can't be 100%
sure, but this patch

>
> - changing CR0.PG from 1 to 0 (because it flushes the TLB according to
>   the x86 architecture specification)
>
> - changing CPUID (which changes the interpretation of page tables in
>   ways not reflected by the role).
>
> - changing CR4.SMEP from 0 to 1 (not doing so actually breaks access.c)
>
> Except for these cases, once the MMU has updated the CPU/MMU roles
> and metadata it is enough to force-reload the current value of CR3.
> KVM will look up the cached roots for an entry with the right role and
> PGD, and only if the cache misses a new root will be created.
>
> Measuring with vmexit.flat from kvm-unit-tests shows the following
> improvement:
>
>              TDP     legacy    shadow
>    before    46754    5096      5150
>    after      4879    4875      5006
>
> which is for very small page tables.  The impact is however much larger
> when running as an L1 hypervisor, because the new page tables cause
> extra work for L0 to shadow them.
>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
>  arch/x86/kvm/mmu/mmu.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index c44b5114f947..913cc7229bf4 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5043,8 +5043,8 @@ EXPORT_SYMBOL_GPL(kvm_init_mmu);
>  void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  {
>  	/*
> -	 * Invalidate all MMU roles to force them to reinitialize as CPUID
> -	 * information is factored into reserved bit calculations.
> +	 * Invalidate all MMU roles and roots to force them to reinitialize,
> +	 * as CPUID information is factored into reserved bit calculations.
>  	 *
>  	 * Correctly handling multiple vCPU models with respect to paging and
>  	 * physical address properties) in a single VM would require tracking
> @@ -5057,6 +5057,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
>  	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
>  	vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
> +	kvm_mmu_unload(vcpu);
>  	kvm_mmu_reset_context(vcpu);
>
>  	/*
> @@ -5068,8 +5069,8 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>
>  void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
>  {
> -	kvm_mmu_unload(vcpu);
>  	kvm_init_mmu(vcpu);
> +	kvm_make_request(KVM_REQ_MMU_UPDATE_ROOT, vcpu);
>  }
>  EXPORT_SYMBOL_GPL(kvm_mmu_reset_context);
>

How about the call to kvm_mmu_reset_context in nested_vmx_restore_host_state?
This is a fallback path though.

Best regards,
	Maxim Levitsky
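
For readers following the thread without the KVM source at hand, below is a
minimal, self-contained sketch of the idea the commit message relies on:
reloading CR3 first searches a small cache keyed by (role, PGD) and only
pays for a full root rebuild on a miss.  Every name in it (struct
cached_root, root_cache, find_or_build_root, build_root) is invented for
illustration; it is not KVM's actual root-cache code, which lives in
arch/x86/kvm/mmu/mmu.c around kvm_mmu_new_pgd() and its helpers.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ROOT_CACHE_ENTRIES 4

struct cached_root {
	bool     valid;
	uint32_t role;   /* paging mode, SMEP/SMAP bits, etc. */
	uint64_t pgd;    /* guest CR3/EPTP the root was built for */
	uint64_t root;   /* host-side root page table address */
};

static struct cached_root root_cache[ROOT_CACHE_ENTRIES];

/* Stand-in for the expensive full rebuild of a root page table. */
static uint64_t build_root(uint32_t role, uint64_t pgd)
{
	return pgd ^ ((uint64_t)role << 52);   /* placeholder value */
}

/*
 * After a role change that does not require a flush, it is enough to redo
 * this lookup with the current PGD: a hit reuses the existing root, only a
 * miss triggers a rebuild.
 */
static uint64_t find_or_build_root(uint32_t role, uint64_t pgd)
{
	int i, victim = 0;

	for (i = 0; i < ROOT_CACHE_ENTRIES; i++) {
		if (root_cache[i].valid &&
		    root_cache[i].role == role &&
		    root_cache[i].pgd == pgd)
			return root_cache[i].root;      /* cache hit */
		if (!root_cache[i].valid)
			victim = i;                     /* prefer an empty slot */
	}

	root_cache[victim] = (struct cached_root) {
		.valid = true,
		.role  = role,
		.pgd   = pgd,
		.root  = build_root(role, pgd),
	};
	return root_cache[victim].root;
}

int main(void)
{
	uint64_t a = find_or_build_root(0x1, 0x1000);   /* miss: builds a root */
	uint64_t b = find_or_build_root(0x1, 0x1000);   /* hit: same root back */
	uint64_t c = find_or_build_root(0x3, 0x1000);   /* new role: new root  */

	printf("a=%llx b=%llx c=%llx (a == b, a != c)\n",
	       (unsigned long long)a, (unsigned long long)b,
	       (unsigned long long)c);
	return 0;
}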