On Fri, Jul 01, 2022 at 10:41:08PM +1200, Kai Huang <kai.huang@xxxxxxxxx> wrote:
> On Mon, 2022-06-27 at 14:53 -0700, isaku.yamahata@xxxxxxxxx wrote:
> > From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> >
> > For kvm mmu that has shared bit mask, zap only leaf SPTEs when
> > deleting/moving a memslot. The existing kvm_mmu_zap_memslot() depends on
>
> Unless I am mistaken, I don't see there's an 'existing' kvm_mmu_zap_memslot().

Oops. It should be kvm_tdp_mmu_invalidate_all_roots().

> > role.invalid with read lock of mmu_lock so that other vcpu can operate on
> > kvm mmu concurrently.
> >
> > Mark the root page table invalid, unlink it from page
> > table pointer of CPU, process the page table.
>
> Are you talking about the behaviour of existing code, or the change you are
> going to make? Looks like you mean the latter but I believe it's the former.

The existing code. The sentence should read "It marks ...".

> > It doesn't work for private
> > page table to unlink the root page table because it requires all SPTE entry
> > to be non-present.
>
> I don't think we can truly *unlink* the private root page table from secure
> EPTP, right? The EPTP (root table) is fixed (and hidden) during TD's runtime.
>
> I guess you are trying to say: removing/unlinking one secure-EPT page requires
> removing/unlinking all its children first?

That's right. I'll update it as follows.

> So the reason to only zap leaf is we cannot truly unlink the private root page
> table, correct? Sorry your changelog is not obvious to me.

The reason is that, to unlink a page table from the parent's SPTE, all SPTEs
of the page table need to be non-present.

Here is the updated commit message.

KVM: x86/mmu: Zap only leaf SPTEs for deleted/moved memslot for private mmu

| For kvm mmu that has shared bit mask, zap only leaf SPTEs when
| deleting/moving a memslot. The existing kvm_tdp_mmu_invalidate_all_roots()
| depends on role.invalid with read lock of mmu_lock so that other vcpu can
| operate on kvm mmu concurrently. It marks the root page table invalid and
| zaps SPTEs of the root page tables.
|
| It doesn't work to unlink a private page table from the parent's SPTE entry
| because it requires all SPTE entries of the page table to be non-present.
| Instead, take the write-lock of mmu_lock and zap only leaf SPTEs for kvm
| mmu with shared bit mask.

> > Instead, with write-lock of mmu_lock and zap only leaf
> > SPTEs for kvm mmu with shared bit mask.
> >
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 35 ++++++++++++++++++++++++++++++++++-
> >  1 file changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 80d7c7709af3..c517c7bca105 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -5854,11 +5854,44 @@ static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
> >  	return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
> >  }
> >
> > +static void kvm_mmu_zap_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
> > +{
> > +	bool flush = false;
> > +
> > +	write_lock(&kvm->mmu_lock);
> > +
> > +	/*
> > +	 * Zapping non-leaf SPTEs, a.k.a. not-last SPTEs, isn't required, worst
> > +	 * case scenario we'll have unused shadow pages lying around until they
> > +	 * are recycled due to age or when the VM is destroyed.
> > +	 */
> > +	if (is_tdp_mmu_enabled(kvm)) {
> > +		struct kvm_gfn_range range = {
> > +			.slot = slot,
> > +			.start = slot->base_gfn,
> > +			.end = slot->base_gfn + slot->npages,
> > +			.may_block = false,
> > +		};
> > +
> > +		flush = kvm_tdp_mmu_unmap_gfn_range(kvm, &range, flush);
>
> It appears you only unmap private GFNs (because the base_gfn doesn't have shared
> bit)? I think shared mapping in this slot must be zapped too?
>
> How is this done? Or the kvm_tdp_mmu_unmap_gfn_range() also zaps shared
> mappings?

kvm_tdp_mmu_unmap_gfn_range() handles both private gfn and shared gfn as they
are aliases.

> It's hard to review if one patch's behaviour/logic depends on further patches.

I'll add a comment on the call.
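Something like the below, on top of this patch (tentative wording only, to
show what the comment would say; the actual hunk will be in the next version):

	/*
	 * Private gfn and shared gfn are aliases of the same gfn, so
	 * kvm_tdp_mmu_unmap_gfn_range() zaps both the shared and the
	 * private mappings of the range, even though range.start is
	 * built from slot->base_gfn without the shared bit.
	 */
	flush = kvm_tdp_mmu_unmap_gfn_range(kvm, &range, flush);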
-- 
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>