On Thu, Nov 11, 2021, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>
>
> For shadow paging, the pae_root needs to be reconstructed before the
> coming VMENTER if the guest PDPTEs is changed.
>
> But not all paths that call load_pdptrs() will cause the pae_root to be
> reconstructed. Normally, kvm_mmu_reset_context() and kvm_mmu_free_roots()
> are used to launch later reconstruction.
>
> The commit d81135a57aa6("KVM: x86: do not reset mmu if CR0.CD and
> CR0.NW are changed") skips kvm_mmu_reset_context() after load_pdptrs()
> when changing CR0.CD and CR0.NW.
>
> The commit 21823fbda552("KVM: x86: Invalidate all PGDs for the current
> PCID on MOV CR3 w/ flush") skips kvm_mmu_free_roots() after
> load_pdptrs() when rewriting the CR3 with the same value.

This isn't accurate. Prior to that commit, KVM wasn't guaranteed to do
kvm_mmu_free_roots() if it got a hit on the current CR3 or if a previous CR3 in
the cache matched the new CR3 (the "cache" has done some odd things in the past).

So I think this particular flavor would be:

  Fixes: 7c390d350f8b ("kvm: x86: Add fast CR3 switch code path")

> The commit a91a7c709600("KVM: X86: Don't reset mmu context when
> toggling X86_CR4_PGE") skips kvm_mmu_reset_context() after
> load_pdptrs() when changing CR4.PGE.
>
> Normally, the guest doesn't change the PDPTEs before doing only the
> above operation without touching other bits that can force pae_root to
> be reconstructed. Guests like linux would keep the PDPTEs unchanged
> for every instance of pagetable.
>
> Fixes: d81135a57aa6("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")
> Fixes: 21823fbda552("KVM: x86: Invalidate all PGDs for the current PCID on MOV CR3 w/ flush")
> Fixes: a91a7c709600("KVM: X86: Don't reset mmu context when toggling X86_CR4_PGE")
> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>
> ---
>  arch/x86/kvm/x86.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0176eaa86a35..cfba337e46ab 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -832,8 +832,14 @@ int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3)
>  	if (memcmp(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs))) {
>  		memcpy(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs));
>  		kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
> -		/* Ensure the dirty PDPTEs to be loaded. */
> -		kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
> +		/*
> +		 * Ensure the dirty PDPTEs to be loaded for VMX with EPT
> +		 * enabled or pae_root to be reconstructed for shadow paging.
> +		 */
> +		if (tdp_enabled)
> +			kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
> +		else
> +			kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);

Shouldn't matter since it's legacy shadow paging, but @mmu should be used instead
of vcpu->arch.mmu.

To avoid a dependency on the previous patch, I think it makes sense to have this be:

	if (!tdp_enabled && memcmp(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs)))
		kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOT_CURRENT);

before the memcpy().

Then we can decide independently whether skipping KVM_REQ_LOAD_MMU_PGD is safe
when the PDPTRs are unchanged with respect to the MMU.
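To illustrate why the check has to come before the memcpy(), here is a rough
user-space model of that ordering. It is not kernel code: struct model_mmu,
model_free_current_root(), model_load_pdptrs() and load_pgd_requests are
hypothetical stand-ins, and the sketch assumes the copy and the
KVM_REQ_LOAD_MMU_PGD request stay unconditional when the previous patch is not
applied.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct kvm_mmu; only what the sketch needs. */
struct model_mmu {
	uint64_t pdptrs[4];
	bool root_valid;	/* models a usable pae_root */
};

static bool tdp_enabled;	/* modeled after the KVM global of the same name */
static int load_pgd_requests;	/* counts modeled KVM_REQ_LOAD_MMU_PGD requests */

/* Models kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOT_CURRENT). */
static void model_free_current_root(struct model_mmu *mmu)
{
	mmu->root_valid = false;
}

/* Models the tail of load_pdptrs() with the check placed before the copy. */
static void model_load_pdptrs(struct model_mmu *mmu, const uint64_t pdpte[4])
{
	/*
	 * Compare against the old PDPTEs while they are still cached.  After
	 * the memcpy() below the two arrays are identical and the change
	 * could no longer be detected.
	 */
	if (!tdp_enabled && memcmp(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs)))
		model_free_current_root(mmu);

	memcpy(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs));

	/* Assumption: the request is still made unconditionally. */
	load_pgd_requests++;
}
```

In this model, loading identical PDPTEs under shadow paging leaves the root
alone, loading different ones drops it, and with tdp_enabled set the root is
never touched.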