On Fri, Feb 11, 2022, Sean Christopherson wrote:
> On Wed, Feb 09, 2022, Paolo Bonzini wrote:
> > Right now, PGD caching requires a complicated dance of first computing
> > the MMU role and passing it to __kvm_mmu_new_pgd, and then separately calling
>
> Nit, adding () after function names helps readers easily recognize when you're
> talking about a specific function, e.g. as opposed to a concept or whatever.
>
> > kvm_init_mmu.
> >
> > Part of this is due to kvm_mmu_free_roots using mmu->root_level and
> > mmu->shadow_root_level to distinguish whether the page table uses a single
> > root or 4 PAE roots.  Because kvm_init_mmu can overwrite mmu->root_level,
> > kvm_mmu_free_roots must be called before kvm_init_mmu.
> >
> > However, even after kvm_init_mmu there is a way to detect whether the page table
> > has a single root or four, because the pae_root does not have an associated
> > struct kvm_mmu_page.
>
> Suggest a reword on the final paragraph, because there's a discrepancy with the
> code (which handles 0, 1, or 4 "roots", versus just "single or four").
>
>   However, even after kvm_init_mmu() there is a way to detect whether the
>   page table may hold PAE roots, as root.hpa isn't backed by a shadow when
>   it points at PAE roots.
>
> > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 3c3f597ea00d..95d0fa0bb876 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3219,12 +3219,15 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> >  	struct kvm *kvm = vcpu->kvm;
> >  	int i;
> >  	LIST_HEAD(invalid_list);
> > -	bool free_active_root = roots_to_free & KVM_MMU_ROOT_CURRENT;
> > +	bool free_active_root;
> >
> >  	BUILD_BUG_ON(KVM_MMU_NUM_PREV_ROOTS >= BITS_PER_LONG);
> >
> >  	/* Before acquiring the MMU lock, see if we need to do any real work. */
> > -	if (!(free_active_root && VALID_PAGE(mmu->root.hpa))) {
> > +	free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT)
> > +		&& VALID_PAGE(mmu->root.hpa);
>
> 	free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT) &&
> 			   VALID_PAGE(mmu->root.hpa);
>
> Isn't this a separate bug fix?  E.g. call kvm_mmu_unload() without a valid current
> root, but with valid previous roots?  In which case we'd try to free garbage, no?
>
> > +
> > +	if (!free_active_root) {
> >  		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
> >  			if ((roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) &&
> >  			    VALID_PAGE(mmu->prev_roots[i].hpa))
> > @@ -3242,8 +3245,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> >  					   &invalid_list);
> >
> >  	if (free_active_root) {
> > -		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
> > -		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
> > +		if (to_shadow_page(mmu->root.hpa)) {
> >  			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
> >  		} else if (mmu->pae_root) {

Gah, this is technically wrong.  It shouldn't truly matter, but it's wrong.  root.hpa
will not be backed by a shadow page if the root is pml4_root or pml5_root, in which
case freeing the PAE root is wrong.  They should obviously be invalid already, but
it's a little confusing because KVM wanders down a path that may not be relevant
to the current mode.

For clarity, I think it's worth doing:

		} else if (mmu->root.hpa == __pa(mmu->pae_root)) {

> >  			for (i = 0; i < 4; ++i) {
> > --
> > 2.31.1
> >
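
To make the distinction above concrete, here is a minimal, standalone sketch of
the two checks.  It is not KVM code: struct toy_mmu, the toy_* helpers, and the
TOY_* constants are all hypothetical stand-ins, and the has_shadow_page flag
merely models "to_shadow_page(mmu->root.hpa) returns non-NULL".  The only point
is that a root which is neither shadow-page-backed nor the PAE root (e.g.
pml4_root or pml5_root) gets treated as PAE by the "is pae_root allocated"
check, but not by the explicit address comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_hpa_t;
#define TOY_INVALID_PAGE (~(toy_hpa_t)0)

/* Hypothetical, heavily simplified stand-in for struct kvm_mmu. */
struct toy_mmu {
	toy_hpa_t root_hpa;     /* models mmu->root.hpa */
	toy_hpa_t pae_root_pa;  /* models __pa(mmu->pae_root); TOY_INVALID_PAGE if unallocated */
	bool has_shadow_page;   /* models to_shadow_page(mmu->root.hpa) != NULL */
};

enum toy_root_kind {
	TOY_ROOT_NONE,         /* no valid current root */
	TOY_ROOT_SHADOW_PAGE,  /* single root backed by a shadow page */
	TOY_ROOT_PAE,          /* root.hpa points at the PAE roots */
	TOY_ROOT_OTHER,        /* e.g. pml4_root or pml5_root */
};

/* Models the check as posted: any allocated pae_root is treated as "PAE". */
static inline enum toy_root_kind toy_classify_as_posted(const struct toy_mmu *mmu)
{
	assert(mmu != NULL);
	if (mmu->root_hpa == TOY_INVALID_PAGE)
		return TOY_ROOT_NONE;
	if (mmu->has_shadow_page)
		return TOY_ROOT_SHADOW_PAGE;
	if (mmu->pae_root_pa != TOY_INVALID_PAGE)
		return TOY_ROOT_PAE;
	return TOY_ROOT_OTHER;
}

/* Models the suggested check: PAE only if root.hpa really is the PAE root. */
static inline enum toy_root_kind toy_classify_suggested(const struct toy_mmu *mmu)
{
	assert(mmu != NULL);
	if (mmu->root_hpa == TOY_INVALID_PAGE)
		return TOY_ROOT_NONE;
	if (mmu->has_shadow_page)
		return TOY_ROOT_SHADOW_PAGE;
	if (mmu->pae_root_pa != TOY_INVALID_PAGE &&
	    mmu->root_hpa == mmu->pae_root_pa)
		return TOY_ROOT_PAE;
	return TOY_ROOT_OTHER;
}
```

With pae_root allocated and root_hpa set to some third address, the as-posted
variant reports TOY_ROOT_PAE while the suggested variant reports TOY_ROOT_OTHER,
which is the "wanders down a path that may not be relevant" case.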