Consider the case where L2 exits to L0 during event delivery. vmx_complete_interrupts() sees that the IDT-vectoring-info field is valid and therefore updates the KVM structs for event reinjection on the next L2 resume.

Assume that before L0 reaches vcpu_enter_guest(), another L1 CPU sends an IPI via virtual posted interrupts: it writes a new vector into the destination's nested.pi_desc->pir and then triggers an IPI with vector nested.posted_intr_nv. This reaches vmx_deliver_nested_posted_interrupt(), which does not send a physical IPI (because vcpu->mode != IN_GUEST_MODE) but instead just sets nested.pi_pending = true and raises KVM_REQ_EVENT.

When the destination CPU reaches vcpu_enter_guest(), it consumes KVM_REQ_EVENT and calls inject_pending_event(), which calls check_nested_events(). However, because there is an event pending reinjection into L2, vmx_check_nested_events() returns before ever calling vmx_complete_nested_posted_interrupt()! As a result, neither the L1 virtual-APIC page nor vmcs02's RVI is updated.

Assume that at this point we exit L2 and some L1 interrupt is raised afterwards (for example, another L1 CPU IPI). We reach vcpu_enter_guest() again and call check_nested_events(), which exits from L2 to L1 due to the pending interrupt and returns. Again, without calling vmx_complete_nested_posted_interrupt()! At this point KVM_REQ_EVENT has already been consumed and therefore cleared. When L1 next does VMRESUME into L2, it runs L2 with a stale virtual-apic-page, a wrong RVI, and PIR.ON still set. Which is of course a bug...

Fix this entire complex issue by simply making vmx_check_nested_events() always call vmx_complete_nested_posted_interrupt().
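For context, the completion step that must not be skipped moves pending vectors from the posted-interrupt descriptor's PIR into the virtual-APIC page's IRR and clears the descriptor's ON bit. Below is a simplified, self-contained sketch of that bookkeeping; the struct layout and function name here are illustrative only, not the exact definitions in arch/x86/kvm/vmx.c:

```c
#include <stdint.h>

/*
 * Illustrative posted-interrupt descriptor: a 256-bit PIR bitmap
 * (one bit per interrupt vector) plus the outstanding-notification
 * (ON) flag. The real struct pi_desc packs these together with
 * other control bits.
 */
struct pi_desc_sketch {
	uint64_t pir[4];	/* pending-interrupt requests, vectors 0..255 */
	int on;			/* outstanding-notification bit */
};

/*
 * Sketch of what the completion step must achieve: if notification
 * is outstanding, clear ON and transfer every pending vector from
 * the PIR into the virtual-APIC IRR, so L2 later resumes with a
 * consistent virtual-apic-page and RVI.
 */
static void complete_posted_interrupt_sketch(struct pi_desc_sketch *pi,
					     uint64_t irr[4])
{
	int i;

	if (!pi->on)
		return;
	pi->on = 0;
	for (i = 0; i < 4; i++) {
		irr[i] |= pi->pir[i];
		pi->pir[i] = 0;
	}
}
```

The bug described above is that this step was silently skipped whenever vmx_check_nested_events() returned early, leaving PIR.ON set while the vectors never reached the IRR.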
Fixes: 705699a13994 ("KVM: nVMX: Enable nested posted interrupt processing")
Signed-off-by: Liran Alon <liran.alon@xxxxxxxxxx>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@xxxxxxxxxx>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
---
 arch/x86/kvm/vmx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c440df4a1604..d1981620c13a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11029,6 +11029,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 	bool block_nested_events = vmx->nested.nested_run_pending ||
 		kvm_event_needs_reinjection(vcpu);
 
+	vmx_complete_nested_posted_interrupt(vcpu);
+
 	if (vcpu->arch.exception.pending &&
 	    nested_vmx_check_exception(vcpu, &exit_qual)) {
 		if (block_nested_events)
@@ -11069,7 +11071,6 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 		return 0;
 	}
 
-	vmx_complete_nested_posted_interrupt(vcpu);
 	return 0;
 }
 
-- 
1.9.1