Re: [PATCH] KVM: nVMX: Morph notification vector IRQ on nested VM-Enter to pending PI

Jim Mattson <jmattson@xxxxxxxxxx> · Tue, 6 Oct 2020 10:36:09 -0700

On Wed, Aug 12, 2020 at 10:51 AM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On successful nested VM-Enter, check for pending interrupts and convert
> the highest priority interrupt to a pending posted interrupt if it
> matches L2's notification vector.  If the vCPU receives a notification
> interrupt before nested VM-Enter (assuming L1 disables IRQs before doing
> VM-Enter), the pending interrupt (for L1) should be recognized and
> processed as a posted interrupt when interrupts become unblocked after
> VM-Enter to L2.
>
> This fixes a bug where L1/L2 will get stuck in an infinite loop if L1 is
> trying to inject an interrupt into L2 by setting the appropriate bit in
> L2's PIR and sending a self-IPI prior to VM-Enter (as opposed to KVM's
> method of manually moving the vector from PIR->vIRR/RVI).  KVM will
> observe the IPI while the vCPU is in L1 context and so won't immediately
> morph it to a posted interrupt for L2.  The pending interrupt will be
> seen by vmx_check_nested_events(), cause KVM to force an immediate exit
> after nested VM-Enter, and eventually be reflected to L1 as a VM-Exit.
> After handling the VM-Exit, L1 will see that L2 has a pending interrupt
> in PIR, send another IPI, and repeat until L2 is killed.
>
> Note, posted interrupts require virtual interrupt delivery, and virtual
> interrupt delivery requires exit-on-interrupt, ergo interrupts will be
> unconditionally unmasked on VM-Enter if posted interrupts are enabled.
>
> Fixes: 705699a13994 ("KVM: nVMX: Enable nested posted interrupt processing")
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Liran Alon <liran.alon@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---

I don't think this is the best fix.

I believe the real problem is the way that external and posted
interrupts are handled in vmx_check_nested_events().

First of all, I believe that the existing call to
vmx_complete_nested_posted_interrupt() at the end of
vmx_check_nested_events() is far too aggressive. Unless I am missing
something in the SDM, posted interrupt processing is *only* triggered
when the notification vector is received in VMX non-root mode. It is
not triggered on VM-entry.

Looking back one block, we have:

if (kvm_cpu_has_interrupt(vcpu) && !vmx_interrupt_blocked(vcpu)) {
    if (block_nested_events)
        return -EBUSY;
    if (!nested_exit_on_intr(vcpu))
        goto no_vmexit;
    nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
    return 0;
}

If nested_exit_on_intr() is true, we should first check to see if
"acknowledge interrupt on exit" is set. If so, we should acknowledge
the interrupt right here, with a call to kvm_cpu_get_interrupt(),
rather than deep in the guts of nested_vmx_vmexit(). If the vector we
get is the notification vector from VMCS12, then we should call
vmx_complete_nested_posted_interrupt(). Otherwise, we should call
nested_vmx_vmexit(EXIT_REASON_EXTERNAL_INTERRUPT) as we do now.
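
As a rough sketch of the flow described above, here is a standalone
model of the decision logic. It is not KVM code: struct vcpu_model, the
action enum, and check_external_interrupt() are hypothetical names made
up for illustration, and reading pending_vector stands in for the call
to kvm_cpu_get_interrupt(). It does not model whether posted interrupts
are actually enabled in VMCS12.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the external-interrupt block. */
enum action {
	ACT_NONE,		/* nothing pending or interrupts blocked */
	ACT_BUSY,		/* corresponds to returning -EBUSY */
	ACT_NO_VMEXIT,		/* corresponds to "goto no_vmexit" */
	ACT_POSTED_INTR,	/* process as a posted interrupt for L2 */
	ACT_EXTINT_VMEXIT,	/* reflect an external-interrupt exit to L1 */
};

/* Hypothetical, simplified stand-in for the vCPU and VMCS12 state. */
struct vcpu_model {
	bool has_interrupt;		/* kvm_cpu_has_interrupt() */
	bool interrupt_blocked;		/* vmx_interrupt_blocked() */
	bool block_nested_events;
	bool exit_on_intr;		/* nested_exit_on_intr() */
	bool ack_intr_on_exit;		/* "acknowledge interrupt on exit" */
	int pending_vector;		/* highest-priority pending vector */
	int posted_intr_nv;		/* notification vector from VMCS12 */
	bool acked;			/* set once the interrupt is acknowledged */
};

static enum action check_external_interrupt(struct vcpu_model *v)
{
	if (!v->has_interrupt || v->interrupt_blocked)
		return ACT_NONE;
	if (v->block_nested_events)
		return ACT_BUSY;
	if (!v->exit_on_intr)
		return ACT_NO_VMEXIT;

	if (v->ack_intr_on_exit) {
		/* Acknowledge here rather than inside the VM-exit path. */
		int vector = v->pending_vector;

		v->has_interrupt = false;
		v->acked = true;

		if (vector == v->posted_intr_nv)
			return ACT_POSTED_INTR;
	}

	return ACT_EXTINT_VMEXIT;
}
```

In this model, a pending vector equal to the notification vector yields
ACT_POSTED_INTR only when "acknowledge interrupt on exit" is set; every
other vector still produces the external-interrupt exit as today.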

Furthermore, vmx_complete_nested_posted_interrupt() should write to
the L1 EOI register, as indicated in step 4 of the 7-step sequence
detailed in section 29.6 of the SDM, volume 3. It skips this step
today.
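
For illustration, the part of that sequence that follows recognition of
the notification vector could be modeled as below. This is a standalone
sketch, not the KVM implementation: all struct and function names are
hypothetical, the L1 EOI write is modeled as a simple counter, and the
step numbers in the comments (other than step 4, which is cited above)
reflect one reading of the SDM and should be checked against it.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS	256
#define NR_WORDS	(NR_VECTORS / 32)

/* Hypothetical model of the posted-interrupt descriptor. */
struct pi_desc_model {
	uint32_t pir[NR_WORDS];	/* posted-interrupt requests */
	bool on;		/* outstanding-notification bit */
};

/* Hypothetical model of L2's virtual-APIC state. */
struct l2_state_model {
	uint32_t virr[NR_WORDS];
	uint8_t rvi;
};

/* Hypothetical model of L1's local APIC; only counts EOI writes. */
struct l1_apic_model {
	unsigned int eoi_writes;
};

/* Return the highest set bit index across the words, or -1 if none. */
static int highest_set_bit(const uint32_t *words)
{
	for (int w = NR_WORDS - 1; w >= 0; w--) {
		if (!words[w])
			continue;
		for (int b = 31; b >= 0; b--)
			if (words[w] & (UINT32_C(1) << b))
				return w * 32 + b;
	}
	return -1;
}

static void complete_posted_interrupt(struct pi_desc_model *pi,
				      struct l2_state_model *l2,
				      struct l1_apic_model *l1)
{
	int max_pir;

	/* Step 3: clear the outstanding-notification bit. */
	pi->on = false;

	/* Step 4: write the L1 EOI register to dismiss the notification. */
	l1->eoi_writes++;

	/* Remember the highest posted vector before PIR is cleared. */
	max_pir = highest_set_bit(pi->pir);

	/* Step 5: OR PIR into VIRR and clear PIR. */
	for (int w = 0; w < NR_WORDS; w++) {
		l2->virr[w] |= pi->pir[w];
		pi->pir[w] = 0;
	}

	/* Step 6: raise RVI if a higher vector was posted. */
	if (max_pir > (int)l2->rvi)
		l2->rvi = (uint8_t)max_pir;

	/* Step 7 (evaluating pending virtual interrupts) is not modeled. */
}
```

The point of the model is only that the EOI write happens exactly once
per processed notification, between clearing ON and merging PIR.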