On 12/11/21 03:39, Sean Christopherson wrote:
That means that KVM (a) is somehow losing track of a root, (b) isn't zapping all
SPTEs in kvm_mmu_zap_all(), or (c) is installing a SPTE after the mm has been released.
(a) is unlikely because kvm_tdp_mmu_get_vcpu_root_hpa() is the only way for a
vCPU to get a reference, and it holds mmu_lock for write, doesn't yield, and
either gets a root from the list or adds a root to the list.
(b) is unlikely because I would expect the fallout to be much larger and not
unique to your setup.
Hmm, I think it's kvm_mmu_zap_all() skipping invalidated roots. One fix
could be the following - untested and uncompiled, after all it's Saturday.
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 7c5dd83e52de..2e05b6a815b6 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -781,18 +781,6 @@ bool __kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, int as_id, gfn_t start,
return flush;
}
-void kvm_tdp_mmu_zap_all(struct kvm *kvm)
-{
- bool flush = false;
- int i;
-
- for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
- flush = kvm_tdp_mmu_zap_gfn_range(kvm, i, 0, -1ull, flush);
-
- if (flush)
- kvm_flush_remote_tlbs(kvm);
-}
-
static struct kvm_mmu_page *next_invalidated_root(struct kvm *kvm,
struct kvm_mmu_page *prev_root)
{
@@ -859,6 +847,33 @@ void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
kvm_flush_remote_tlbs(kvm);
}
+void kvm_tdp_mmu_zap_all(struct kvm *kvm)
+{
+ struct kvm_mmu_page *root, *next_root;
+ bool flush = false;
+
+ /*
+ * We need to zap all roots, including already-invalid ones. The
+ * easiest way is to ensure there's only invalid roots which then,
+ * for efficiency, we zap with shared==false unlike
+ * kvm_tdp_mmu_zap_invalidated_roots.
+ */
+ kvm_tdp_mmu_invalidate_all_roots(kvm);
+
+ root = next_invalidated_root(kvm, NULL);
+ while (root) {
+ flush = zap_gfn_range(kvm, root, 0, -1ull, true, flush, false);
+ next_root = next_invalidated_root(kvm, root);
+
+ /* Put the reference acquired in kvm_tdp_mmu_invalidate_roots. */
+ kvm_tdp_mmu_put_root(kvm, root, false);
+ root = next_root;
+ }
+
+ if (flush)
+ kvm_flush_remote_tlbs(kvm);
+}
+
/*
* Mark each TDP MMU root as invalid so that other threads
* will drop their references and allow the root count to
Paolo