Re: [PATCH v2 17/21] KVM: x86: Morph pending exceptions to pending VM-Exits at queue time

Sean Christopherson <seanjc@xxxxxxxxxx> · Mon, 11 Jul 2022 15:22:52 +0000

On Sun, Jul 10, 2022, Maxim Levitsky wrote:
> On Thu, 2022-07-07 at 01:24 +0000, Sean Christopherson wrote:
> > On Wed, Jul 06, 2022, Maxim Levitsky wrote:
> > > Other than that, this is a _very_ good idea to add it to KVM, although
> > > maybe we should put it in Documentation folder instead?
> > > (but I don't have a strong preference on this)
> > 
> > I definitely want a comment in KVM that's relatively close to the code.  I'm not
> > opposed to also adding something in Documentation, but I'd want that to be an "and"
> > not an "or".
> 
> Also makes sense. 
> 
> I do think that it is worthwhile to also add a comment about the way KVM
> handles exceptions, which means that inject_pending_event is not always called on
> an instruction boundary. When we have a pending/injected exception, we first have
> to get rid of it, and only then will we be on an instruction boundary.

Yeah, though it's not like KVM has much of a choice, e.g. intercepted=>reflected
exceptions must be injected during instruction execution.  I wouldn't be opposed
to renaming inject_pending_event() if someone can come up with a decent alternative
that's sufficiently descriptive but not comically verbose.

kvm_check_events() to pair with kvm_check_nested_events()?  kvm_check_and_inject_events()?  

> And to be sure that we will inject pending interrupts on the closest instruction
> boundary, we actually open an interrupt/smi/nmi window there.
> 
> > This is calling out something slightly different.  What it's saying is that if
> > there was a pending exception, then KVM should _not_ have injected said pending
> > exception and instead should have requested an immediate exit.  That "immediate
> > exit" should have forced a VM-Exit before the CPU could fetch a new instruction,
> > and thus before the guest could trigger an exception that would require reinjection.
> > 
> > The "immediate exit" trick works because all events with higher priority than the
> > VMX preemption timer (or IRQ) are guaranteed to exit, e.g. a hardware SMI can't
> > cause a fault in the guest.
> 
> Yes it all makes sense now. It really helps thinking in terms of instruction boundary.
> 
> However, that makes me think: Can that actually happen?

I don't think KVM can get itself in that state, but I believe userspace could force
it by using KVM_SET_VCPU_EVENTS + KVM_SET_NESTED_STATE.
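
To make the rule concrete, here is a rough standalone sketch, not KVM code.  It
assumes the state in question is "exception pending while a nested run is
pending"; struct toy_vcpu, enum toy_action and toy_check_events() are all
made-up names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the relevant vCPU state; not a real KVM structure. */
struct toy_vcpu {
	bool nested_run_pending;	/* nested VM-Entry has not completed yet */
	bool exception_pending;		/* queued, side effects not yet delivered */
	bool exception_injected;	/* handed to "hardware" for delivery */
	bool req_immediate_exit;	/* force a VM-Exit before the next fetch */
};

enum toy_action { TOY_NOTHING, TOY_INJECT, TOY_IMMEDIATE_EXIT };

/*
 * Hypothetical model of the check discussed above: with a nested run
 * pending, a pending exception is left pending and an immediate exit is
 * requested instead of injecting it.
 */
static enum toy_action toy_check_events(struct toy_vcpu *v)
{
	/* An exception is either pending or injected, never both. */
	assert(!(v->exception_pending && v->exception_injected));

	if (!v->exception_pending)
		return TOY_NOTHING;

	if (v->nested_run_pending) {
		v->req_immediate_exit = true;
		return TOY_IMMEDIATE_EXIT;
	}

	v->exception_pending = false;
	v->exception_injected = true;
	return TOY_INJECT;
}
```

In this model the only way to reach the TOY_IMMEDIATE_EXIT branch is to set both
flags from the outside, which mirrors the userspace-forced case.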

> A pending exception can only be generated by KVM itself (nested hypervisor,
> and CPU reflected exceptions/interrupts are all injected).
> 
> If VMRUN/VMRESUME has a pending exception, it means that it itself generated it,
> in which case we won't be entering the guest, but rather jump to the
> exception handler, and thus nested run will not be pending.

Notably, SVM handles single-step #DBs on VMRUN in the nested VM-Exit path.  That's
the only exception that I can think of off the top of my head that can be coincident
with a successful VM-Entry (ignoring things like NMI=>#PF).

> We can though have pending NMI/SMI/interrupts.
> 
> Also just a note about injected exceptions/interrupts during VMRUN/VMRESUME.
> 
> If nested_run_pending is true, then by the same reasoning the injected exception
> cannot come from VMRUN/VMRESUME. It can come from the nested hypervisor's EVENTINJ,
> but in this case we currently just copy it from vmcb12/vmcs12 to vmcb02/vmcs02,
> without touching vcpu->arch.interrupt.
> 
> Luckily this doesn't cause issues because when the nested run is pending
> we don't inject anything to the guest.
> 
> If nested_run_pending is false however, the opposite is true. The EVENTINJ
> will be already delivered, and we can only have an injected exception/interrupt
> that comes from the cpu itself via exit_int_info/IDT_VECTORING_INFO_FIELD, which
> we will copy back as an injected interrupt/exception to 'vcpu->arch.exception/interrupt'
> and later re-inject, next time we run the same VMRUN instruction.
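
For reference, the two cases above can be modeled roughly as follows.  This is
a standalone toy, not KVM code: struct toy_event, struct toy_nested and both
helpers are made up, and the fields only loosely mirror EVENTINJ,
exit_int_info and vcpu->arch.exception/interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_event {
	bool valid;
	uint8_t vector;
};

/* Toy stand-in for the nested state discussed above. */
struct toy_nested {
	bool nested_run_pending;
	struct toy_event vmcb12_event_inj;	/* L1's EVENTINJ request */
	struct toy_event vmcb02_event_inj;	/* what hardware will inject into L2 */
	struct toy_event vmcb02_exit_int_info;	/* event cut short by a VM-Exit */
	struct toy_event arch_injected;		/* ~ vcpu->arch.exception/interrupt */
};

/*
 * Nested VM-Entry: EVENTINJ is copied from vmcb12 to vmcb02 and the
 * software-tracked injected event is deliberately left untouched.
 */
static void toy_nested_entry(struct toy_nested *n)
{
	n->vmcb02_event_inj = n->vmcb12_event_inj;
	n->nested_run_pending = true;
}

/*
 * First VM-Exit after the entry: the nested run is no longer pending and
 * EVENTINJ has been consumed.  Anything reported in exit_int_info is
 * copied back as an injected event so that it can be re-injected later.
 */
static void toy_complete_exit(struct toy_nested *n)
{
	assert(n->nested_run_pending);

	n->nested_run_pending = false;
	n->vmcb02_event_inj.valid = false;
	n->arch_injected = n->vmcb02_exit_int_info;
	n->vmcb02_exit_int_info.valid = false;
}
```

The point of the split is that arch_injected is only ever written on the exit
path, so while nested_run_pending is true there is nothing in it for the
injection code to act on.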