Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

Lai Jiangshan <jiangshanlai+lkml@xxxxxxxxx> · Thu, 15 Apr 2021 08:59:09 +0800

On Thu, Apr 15, 2021 at 12:58 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 14/04/21 04:28, Lai Jiangshan wrote:
> > On Tue, Apr 13, 2021 at 8:15 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >>
> >> On 13/04/21 13:03, Lai Jiangshan wrote:
> >>> This patch claims that it has a place to
> >>> stash the IRQ when EFLAGS.IF=0, but inject_pending_event() seems to ignore
> >>> EFLAGS.IF and queues the IRQ to the guest directly in the first branch
> >>> of using "kvm_x86_ops.set_irq(vcpu)".
> >>
> >> This is only true for pure-userspace irqchip.  For split-irqchip,
> >> the "place to stash" the interrupt is
> >> vcpu->arch.pending_external_vector.
> >>
> >> For pure-userspace irqchip, KVM_INTERRUPT only cares about being able to
> >> stash the interrupt in vcpu->arch.interrupt.injected.  It is indeed
> >> wrong for userspace to call KVM_INTERRUPT if the vCPU is not ready for
> >> interrupt injection, but KVM_INTERRUPT does not return an error.
> >
> > Thanks for the reply.
> >
> > May I ask what the correct/practical way is of using the KVM_INTERRUPT
> > ABI for a pure-userspace irqchip?
> >
> > gVisor is indeed a pure-userspace irqchip. It calls KVM_INTERRUPT
> > when kvm_run->ready_for_interrupt_injection=1 (along with other conditions
> > unrelated to our discussion).
> >
> > https://github.com/google/gvisor/blob/a9441aea2780da8c93da1c73da860219f98438de/pkg/sentry/platform/kvm/bluepill_amd64_unsafe.go#L105
> >
> > If kvm_run->ready_for_interrupt_injection=1 when an exception is pending
> > or EFLAGS.IF=0, that would be unexpected for gVisor.
>
> Not with EFLAGS.IF=0.  For pending exception, there is code to handle it
> in inject_pending_event:
>

Thanks for the reply.
(I rearranged your summary here.)

> so what happens is:
>
> - the interrupt will not be injected before the exception
>
> - KVM will schedule an immediate vmexit to inject the interrupt as well
>
> - if (as is likely) the exception has turned off interrupts, the next
> call to inject_pending_event will reach
> static_call(kvm_x86_enable_irq_window) and the interrupt will only be
> injected when IF becomes 1 again.

The next call to inject_pending_event() will FIRST reach the code below, with
vcpu->arch.exception.injected==false and vcpu->arch.exception.pending==false:

>          ... if (!vcpu->arch.exception.pending) {
>                  if (vcpu->arch.nmi_injected) {
>                          static_call(kvm_x86_set_nmi)(vcpu);
>                          can_inject = false;
>                  } else if (vcpu->arch.interrupt.injected) {
>                          static_call(kvm_x86_set_irq)(vcpu);
>                          can_inject = false;

It gets here, and vcpu->arch.interrupt.injected is true because there is
an interrupt queued by KVM_INTERRUPT for the pure-userspace irqchip. It then
injects the interrupt without checking EFLAGS.IF.

My question is: what stops the next call to inject_pending_event()
from reaching this branch when KVM_INTERRUPT was called with an exception
pending?

Or, what makes kvm_run->ready_for_interrupt_injection zero while an
exception is pending, so that userspace is not allowed to call KVM_INTERRUPT?

>                  }
>          }
>         ...
>          if (vcpu->arch.exception.pending) {
>                 ...
>                  can_inject = false;
>          }
>         // this is vcpu->arch.interrupt.injected for userspace LAPIC
>          if (kvm_cpu_has_injectable_intr(vcpu)) {
>                  r = can_inject ?
>                      static_call(kvm_x86_interrupt_allowed)(vcpu, true) : -EBUSY;
>                 if (r < 0)
>                         goto busy;
>                 ...
>         }
>
>
> Paolo
>
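
To make the question concrete, here is a minimal, self-contained C model of
the sequence I am worried about. It is not kernel code: struct model_vcpu,
model_scenario_start() and model_inject_pending_event() are hypothetical
names, and the model keeps only the branches quoted in this thread. It also
assumes that delivering the exception clears EFLAGS.IF and that calling
set_irq consumes the queued interrupt. Both are simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the vCPU state discussed here. */
struct model_vcpu {
	bool exception_pending;           /* models vcpu->arch.exception.pending */
	bool interrupt_injected;          /* models vcpu->arch.interrupt.injected */
	bool eflags_if;                   /* models guest EFLAGS.IF */
	bool irq_delivered;               /* set when the model "calls" set_irq */
	bool irq_delivered_with_if_clear; /* set if that happened with IF=0 */
	bool irq_window_requested;        /* models enable_irq_window */
};

/* Userspace called KVM_INTERRUPT while an exception was pending. */
static struct model_vcpu model_scenario_start(void)
{
	struct model_vcpu v = {
		.exception_pending = true,
		.interrupt_injected = true,
		.eflags_if = true,
	};
	return v;
}

/*
 * Models only the quoted branches of inject_pending_event().
 * Returns true when the "busy" path (immediate exit) is taken.
 */
static bool model_inject_pending_event(struct model_vcpu *v)
{
	bool can_inject = true;

	assert(v != NULL);

	if (!v->exception_pending && v->interrupt_injected) {
		/* First branch: set_irq is reached with no EFLAGS.IF check. */
		v->irq_delivered = true;
		if (!v->eflags_if)
			v->irq_delivered_with_if_clear = true;
		v->interrupt_injected = false; /* assumption: delivery consumes it */
		can_inject = false;
	}

	if (v->exception_pending) {
		/* Deliver the exception; assume the handler entry clears IF. */
		v->exception_pending = false;
		v->eflags_if = false;
		can_inject = false;
	}

	if (v->interrupt_injected) {
		if (!can_inject)
			return true;                    /* r = -EBUSY; goto busy */
		if (v->eflags_if) {
			v->irq_delivered = true;        /* interrupt allowed */
			v->interrupt_injected = false;
		} else {
			v->irq_window_requested = true; /* wait for IF=1 */
		}
	}
	return false;
}
```

In this model, the first call delivers the exception and takes the busy path,
as described above. The second call enters the first branch with
interrupt_injected still set, so irq_delivered_with_if_clear becomes true and
the interrupt-window path is never reached. If the real code prevents this,
the model is missing the check that does so, and that check is what I am
asking about.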