On 06/12/17 22:20, Radim Krčmář wrote:
2017-12-05 10:16+0200, Liran Alon:
Before each vmentry to guest, vcpu_enter_guest() calls sync_pir_to_irr()
which calls vmx_hwapic_irr_update() to update RVI.
Currently, vmx_hwapic_irr_update() contains a tweak in case it is called
when CPU is running L2 and L1 don't intercept external-interrupts.
In that case, code injects interrupt directly into L2 instead of
updating RVI.
Besides being hacky (wouldn't expect function updating RVI to also
inject interrupt), it also doesn't handle this case correctly.
The code contains several issues:
1. When code calls kvm_queue_interrupt() it just passes it max_irr which
represents the highest IRR currently pending in L1 LAPIC.
This is problematic as interrupt was injected to guest but it's bit is
still set in LAPIC IRR instead of being cleared from IRR and set in ISR.
2. Code doesn't check if LAPIC PPR is set to accept an interrupt of
max_irr priority. It just checks if interrupts are enabled in guest with
vmx_interrupt_allowed().
Good catch.
To fix the above issues:
1. Simplify vmx_hwapic_irr_update() to just update RVI.
Note that this shouldn't happen when CPU is running L2
(See comment in code).
2. Since now vmx_hwapic_irr_update() only does logic for L1
virtual-interrupt-delivery, inject_pending_event() should be the
one responsible for injecting the interrupt directly into L2.
Therefore, change kvm_cpu_has_injectable_intr() to check L1
LAPIC when CPU is running L2.
This is enough to fix the issue since commit ("KVM: nVMX: Re-evaluate
L1 pending events when running L2 and L1 got posted-interrupt")
made vcpu_enter_guest() to set KVM_REQ_EVENT when L1 has a pending
injectable interrupt.
Right, that answers my question from the first patch.
Fixes: 963fee165660 ("KVM: nVMX: Fix virtual interrupt delivery
injection")
Signed-off-by: Liran Alon <liran.alon@xxxxxxxxxx>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@xxxxxxxxxx>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
---
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 5c24811e8b0b..f171051eecf3 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -79,7 +79,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v)
if (kvm_cpu_has_extint(v))
return 1;
- if (kvm_vcpu_apicv_active(v))
+ if (!is_guest_mode(v) && kvm_vcpu_apicv_active(v))
return 0;
return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 47bbb8b691e8..bbc023fb9ef1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9002,30 +9002,18 @@ static void vmx_set_rvi(int vector)
static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
{
- if (!is_guest_mode(vcpu)) {
- vmx_set_rvi(max_irr);
- return;
- }
-
- if (max_irr == -1)
- return;
-
/*
- * In guest mode. If a vmexit is needed, vmx_check_nested_events
- * handles it.
+ * When running L2, updating RVI is only relevant when
+ * vmcs12 virtual-interrupt-delivery enabled.
+ * However, it can be enabled only when L1 also
+ * intercepts external-interrupts and in that case
+ * we should not update vmcs02 RVI but instead intercept
+ * interrupt. Therefore, do nothing when running L2.
*/
- if (nested_exit_on_intr(vcpu))
+ if ((max_irr == -1) || is_guest_mode(vcpu))
return;
The logic should remain
if (!is_guest_mode(vcpu))
vmx_set_rvi(max_irr);
because we had a bug where RVI was not cleared on reset and I don't
think we fixed it elsewhere. 4114c27d450b ("KVM: x86: reset RVI upon
system reset")
I see. You are right. I wasn't aware of that commit. Will change.
- /*
- * Else, fall back to pre-APICv interrupt injection since L2
- * is run without virtual interrupt delivery.
- */
- if (!kvm_event_needs_reinjection(vcpu) &&
- vmx_interrupt_allowed(vcpu)) {
- kvm_queue_interrupt(vcpu, max_irr, false);
- vmx_inject_irq(vcpu);
- }
+ vmx_set_rvi(max_irr);
}
static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
@@ -9051,6 +9039,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
* If we are running L2 and L1 has a new pending interrupt
* which can be injected, we should re-evaluate
* what should be done with this new L1 interrupt.
+ * If L1 intercepts external-interrupts, we should
+ * exit from L2 to L1. Otherwise, interrupt should be
+ * delivered directly to L2.
+ * These cases are handled correctly by KVM_REQ_EVENT.
*/
if (is_guest_mode(vcpu) && (max_irr > prev_max_irr))
kvm_make_request(KVM_REQ_EVENT, vcpu);
KVM_REQ_EVENT is not needed when we are intercepting the interrupt,
because we just want to run vmx_check_nested_events(), but it is needed
when delivering L1's interrupt inside L2.
OK. If vmcs12 intercept external interrupts, I will just call
kvm_vcpu_exiting_guest_mode(). Otherwise, I will use KVM_REQ_EVENT.
When vmcs12 doesn't intercept external interrupts, I'm thinking we could
pass virtual interrupts from vmcs01 and use RVI normally, as well as
delivering L1 posted interrupts to L2 without a VM exit.
Maybe I am missing something but I am not sure this is possible.
As the comment I added to vmx_hwapic_irr_update() specifies,
virtual-interrupt-delivery cannot be enabled if exit on
external-interrupts is disabled. This can be seen in the following code
in nested_vmx_check_apicv_controls():
/*
* If virtual interrupt delivery is enabled,
* we must exit on external interrupts.
*/
if (nested_cpu_has_vid(vmcs12) &&
!nested_exit_on_intr(vcpu))
return -EINVAL;
Therefore, in case vmcs12 disables exit on
external-interrupts then vmcs12 SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY
must be cleared. As prepare_vmcs02() currently takes value of
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY as-is from vmcs12 to vmcs02, it
means vmcs02 doesn't have virtual-interrupt-delivery in this case.
Therefore, in this case, we should not update vmcs02 RVI.
Regards,
-Liran