Subject: Re: [PATCH v10 047/108] KVM: x86/tdp_mmu: Don't zap private pages for unsupported cases

> From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>
> TDX architecturally supports only the write-back (WB) memory type for
> private memory, so a (virtualized) memory type change doesn't make
> sense for private memory. Also, page migration isn't supported for TDX
> yet. (TDX architecturally supports page migration; it's a KVM and
> kernel implementation issue.)
>
> Regarding memory type changes (MTRR virtualization and LAPIC page
> mapping changes), pages are zapped by kvm_zap_gfn_range(). On the next
> KVM page fault, the SPTE entry with the new memory type for the page
> is populated. Regarding page migration, pages are zapped by the mmu
> notifier. On the next KVM page fault, the new migrated page is
> populated. Don't zap private pages on unmapping for those two cases.
>
> When deleting/moving a KVM memory slot, zap private pages; typically
> this happens when tearing down the VM. Don't invalidate private page
> tables, i.e. zap only leaf SPTEs for a KVM mmu that has a shared bit
> mask. The existing kvm_tdp_mmu_invalidate_all_roots() depends on
> role.invalid with the read-lock of mmu_lock held so that other vCPUs
> can operate on the KVM mmu concurrently. It marks the root page table
> invalid and zaps SPTEs of the root page tables. The TDX module doesn't
> allow unlinking a protected root page table from the hardware and then
> allocating a new one for it, i.e. replacing a protected root page
> table. Instead, zap only leaf SPTEs for a KVM mmu with a shared bit
> mask set.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> ---
>  arch/x86/kvm/mmu/mmu.c     | 85 ++++++++++++++++++++++++++++++++++++--
>  arch/x86/kvm/mmu/tdp_mmu.c | 24 ++++++++---
>  arch/x86/kvm/mmu/tdp_mmu.h |  5 ++-
>  3 files changed, 103 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index faf69774c7ce..0237e143299c 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1577,8 +1577,38 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
>  	if (kvm_memslots_have_rmaps(kvm))
>  		flush = kvm_handle_gfn_range(kvm, range, kvm_zap_rmap);
>
> -	if (is_tdp_mmu_enabled(kvm))
> -		flush = kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush);
> +	if (is_tdp_mmu_enabled(kvm)) {
> +		bool zap_private;

zap_private should be initialized to true; otherwise it is passed
uninitialized to kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush,
zap_private) when the condition

	if (kvm_slot_can_be_private(range->slot)) {

evaluates to false.

> +
> +		if (kvm_slot_can_be_private(range->slot)) {
> +			if (range->flags & KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM)
> +				/*
> +				 * For private slot, the callback is triggered
> +				 * via falloc.  Mode can be allocation or punch
> +				 * hole.  Because the private-shared conversion
> +				 * is done via
> +				 * KVM_MEMORY_ENCRYPT_REG/UNREG_REGION, we can
> +				 * ignore the request from restrictedmem.
> +				 */
> +				return flush;
> +			else if (range->flags & KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR) {
> +				if (range->attr == KVM_MEM_ATTR_SHARED)
> +					zap_private = true;
> +				else {
> +					WARN_ON_ONCE(range->attr != KVM_MEM_ATTR_PRIVATE);
> +					zap_private = false;
> +				}
> +			} else
> +				/*
> +				 * kvm_unmap_gfn_range() is called via mmu
> +				 * notifier.  For now page migration for private
> +				 * page isn't supported yet, don't zap private
> +				 * pages.
> +				 */
> +				zap_private = false;
> +		}
> +		flush = kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush, zap_private);
> +	}
>
>  	return flush;
>  }
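
Concretely, something like this on top of the patch (untested sketch of
the initialization suggested above; all identifiers are taken from the
patch itself, and whether true is the right default for slots that can't
be private is for the TDX folks to confirm):

	if (is_tdp_mmu_enabled(kvm)) {
		/*
		 * Default to zapping private pages as well, so that a
		 * range whose slot can't be private keeps the behavior
		 * this function had before the patch.
		 */
		bool zap_private = true;

		if (kvm_slot_can_be_private(range->slot)) {
			/* ... flag/attr handling as in the patch ... */
		}
		flush = kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush,
						    zap_private);
	}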