On 3/18/22 17:48, Paolo Bonzini wrote:
This reverts commit cf3e26427c08ad9015956293ab389004ac6a338e.
Multi-vCPU Hyper-V guests started crashing randomly on boot with the
latest kvm/queue and the problem can be bisected to this
particular patch. Basically, I'm not able to boot e.g. a 16-vCPU guest
successfully anymore. Both Intel and AMD seem to be affected. Reverting
the commit saves the day.
Reported-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
This is not enough; the following is also needed to account
for "KVM: x86/mmu: Defer TLB flush to caller when freeing TDP MMU shadow
pages":
------------------- 8< ----------------
From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Subject: [PATCH] kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU
Since "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()"
is going to be reverted, it's not going to be true anymore that
the zap-page flow does not free any 'struct kvm_mmu_page'. Introduce
an early flush before tdp_mmu_zap_leafs() returns, to preserve
bisectability.
Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index af60922906ef..7f63e1a704e3 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -941,13 +941,17 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
 		flush = true;
 	}
 
+	/*
+	 * Need to flush before releasing RCU.  TODO: do it only if intermediate
+	 * page tables were zapped; there is no need to flush under RCU protection
+	 * if no 'struct kvm_mmu_page' is freed.
+	 */
+	if (flush)
+		kvm_flush_remote_tlbs_with_address(kvm, start, end - start);
+
 	rcu_read_unlock();
 
-	/*
-	 * Because this flow zaps _only_ leaf SPTEs, the caller doesn't need
-	 * to provide RCU protection as no 'struct kvm_mmu_page' will be freed.
-	 */
-	return flush;
+	return false;
 }
 
 /*