[PATCH] KVM: nVMX: Do not recalc IOAPIC handled vectors while running L2

When the L1 IOAPIC redirection table is written, KVM_REQ_SCAN_IOAPIC
is requested on all vCPUs so that each vCPU recalculates its IOAPIC
handled vectors.

However, one of the vCPUs may be running L2 at that moment. In that
case, vcpu_scan_ioapic() is called while is_guest_mode(vcpu) == true,
and it in turn calls load_eoi_exitmap(), which writes to
vmcs02->eoi_exit_bitmap. This is wrong because vmcs02->eoi_exit_bitmap
should always be equal to vmcs12->eoi_exit_bitmap.
Furthermore, at this point KVM_REQ_SCAN_IOAPIC has already been
consumed, so vmcs01->eoi_exit_bitmap will never be updated. As a
result, remote_irr of some level-triggered IOAPIC entry could remain
set forever.

Fix this issue by deferring KVM_REQ_SCAN_IOAPIC processing until the
vCPU is running L1 again (is_guest_mode(vcpu) == false).

The issue was reproduced with the following setup:
* L0 runs KVM with 64 CPUs
* L1 runs ESXi 6.0 with 8 CPUs
* ESXi runs 4 L2 VMs:
1. Windows 8.1 32bit with 4 CPUs
2. Ubuntu 17 Server with 4 CPUs
3. Ubuntu Desktop with 2 CPUs
4. CentOS 32bit with 1 CPU
A short while after booting all the L2 VMs, ESXi lost networking.
Examining the issue revealed that ESXi dynamically reconfigures
the IOAPIC redirection-table entry of the NIC, and that shortly
after such a reconfiguration the entry's remote_irr remained set
forever.

Signed-off-by: Liran Alon <liran.alon@xxxxxxxxxx>
Reviewed-by: Arbel Moshe <arbel.moshe@xxxxxxxxxx>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@xxxxxxxxxx>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 10 +++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)
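
Note for reviewers (not part of the commit message): the deferral
described above can be modelled outside the kernel. The following is a
minimal user-space sketch, not kernel code. Every toy_* name is
hypothetical and only stands in for the real helper named in its
comment; scan_ioapic_pending is the field added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vCPU state this patch touches. */
struct toy_vcpu {
	bool guest_mode;          /* models is_guest_mode(vcpu) */
	bool scan_request;        /* models a set KVM_REQ_SCAN_IOAPIC */
	bool scan_ioapic_pending; /* field added by this patch */
	int scans_done;           /* counts rescans actually performed */
};

/* Models kvm_check_request(): test the request bit and clear it. */
static bool toy_check_request(struct toy_vcpu *v)
{
	bool was_set = v->scan_request;

	v->scan_request = false;
	return was_set;
}

/* Models the patched vcpu_scan_ioapic(): defer while running L2. */
static void toy_scan_ioapic(struct toy_vcpu *v)
{
	if (v->guest_mode) {
		v->scan_ioapic_pending = true;
		return;
	}
	v->scan_ioapic_pending = false;
	v->scans_done++; /* the real code rebuilds ioapic_handled_vectors */
}

/* Models the patched check in vcpu_enter_guest(). */
static void toy_enter_guest(struct toy_vcpu *v)
{
	if (toy_check_request(v) ||
	    (!v->guest_mode && v->scan_ioapic_pending))
		toy_scan_ioapic(v);
}

/* Walks through the scenario; returns the number of rescans done. */
int toy_demo(void)
{
	struct toy_vcpu v = { .guest_mode = true };

	v.scan_request = true;  /* L1 IOAPIC redirection table written */
	toy_enter_guest(&v);    /* running L2: request consumed, deferred */
	assert(v.scans_done == 0 && v.scan_ioapic_pending);

	toy_enter_guest(&v);    /* still L2: nothing to do yet */
	assert(v.scans_done == 0 && v.scan_ioapic_pending);

	v.guest_mode = false;   /* nested VM-exit back to L1 */
	toy_enter_guest(&v);    /* deferred rescan runs now */
	assert(v.scans_done == 1 && !v.scan_ioapic_pending);

	return v.scans_done;
}
```

In this model the request bit is consumed on the first entry, so
without the pending flag the rescan would be lost once the vCPU
returns to L1, which is the failure mode described above.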

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c73e493adf07..ceb8beb1bfc9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -498,6 +498,7 @@ struct kvm_vcpu_arch {
 	u64 apic_base;
 	struct kvm_lapic *apic;    /* kernel irqchip context */
 	bool apicv_active;
+	bool scan_ioapic_pending;
 	DECLARE_BITMAP(ioapic_handled_vectors, 256);
 	unsigned long apic_attention;
 	int32_t apic_arb_prio;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 03869eb7fcd6..ac1339148a9a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6720,6 +6720,12 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	if (!kvm_apic_hw_enabled(vcpu->arch.apic))
 		return;
 
+	if (is_guest_mode(vcpu)) {
+		vcpu->arch.scan_ioapic_pending = true;
+		return;
+	}
+	vcpu->arch.scan_ioapic_pending = false;
+
 	bitmap_zero(vcpu->arch.ioapic_handled_vectors, 256);
 
 	if (irqchip_split(vcpu->kvm))
@@ -6833,7 +6839,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 				goto out;
 			}
 		}
-		if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
+		if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu) ||
+		    (!is_guest_mode(vcpu) && vcpu->arch.scan_ioapic_pending))
 			vcpu_scan_ioapic(vcpu);
 		if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
 			kvm_vcpu_reload_apic_access_page(vcpu);
@@ -7981,6 +7988,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	kvm = vcpu->kvm;
 
 	vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
+	vcpu->arch.scan_ioapic_pending = false;
 	vcpu->arch.pv.pv_unhalted = false;
 	vcpu->arch.emulate_ctxt.ops = &emulate_ops;
 	if (!irqchip_in_kernel(kvm) || kvm_vcpu_is_reset_bsp(vcpu))
-- 
1.9.1



