Re: [PATCH] KVM: nVMX: Fix direct injection of interrupts from L0 to L2

Joerg Roedel <joro@xxxxxxxxxx> · Tue, 19 Feb 2013 17:14:19 +0100

On Tue, Feb 19, 2013 at 11:04:01AM +0100, Jan Kiszka wrote:
> I had a look at SVM to check how it deals with this, but I'm not sure
> if I understand the logic correctly. SVM does:
> 
> static int nested_svm_vmexit(struct vcpu_svm *svm)
> {
> 	...
> 	/*
> 	 * If we emulate a VMRUN/#VMEXIT in the same host #vmexit cycle we have
> 	 * to make sure that we do not lose injected events. So check event_inj
> 	 * here and copy it to exit_int_info if it is valid.
> 	 * Exit_int_info and event_inj can't be both valid because the case
> 	 * below only happens on a VMRUN instruction intercept which has
> 	 * no valid exit_int_info set.
> 	 */
> 	if (vmcb->control.event_inj & SVM_EVTINJ_VALID) {
> 		struct vmcb_control_area *nc = &nested_vmcb->control;
> 
> 		nc->exit_int_info     = vmcb->control.event_inj;
> 		nc->exit_int_info_err = vmcb->control.event_inj_err;
> 	}
> 
> nested_svm_vmexit is only called when we leave L2 toward L1, right?

Right.

> So, vmcb->control.event_inj might have been set on last VMRUN emulation, and
> if that one failed, this value shall become the nested exit_int_info. So
> far, so good.

The important point is that this L2->L1 exit is emulated in the same real
#vmexit cycle in which the VMRUN was emulated. So what happens is:

	1. VMRUN intercept from L1
	2. We emulate the VMRUN and load L2 state into VMCB
	3. On the way back to guest mode (to actually run the L2) we
	   detect a #vmexit condition
	4. So we roll back by calling nested_svm_vmexit()
	5. We enter the guest again which continues execution right
	   after its VMRUN instruction.

So we never actually entered L2, but for L1 it has to look like it was
in L2 and made no progress. When coming out of a real guest run, event_inj
is never valid, so it can only be valid here in this rollback case. With
the special case above we make sure that the L1 hypervisor sees the event
in exit_int_info and re-injects it, so it is not lost.
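
To make the sequence concrete, here is a minimal userspace sketch of steps
2 and 4. It is not the kernel code: struct sketch_ctl and all sketch_*
names are hypothetical stand-ins for the few vmcb_control_area fields
involved, and the valid bit is assumed to be bit 31, as with
SVM_EVTINJ_VALID.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed valid bit for EVENTINJ/EXITINTINFO (bit 31). */
#define SKETCH_EVT_VALID (1u << 31)

/* Hypothetical stand-in for the control-area fields used here. */
struct sketch_ctl {
	uint32_t event_inj;
	uint32_t event_inj_err;
	uint32_t exit_int_info;
	uint32_t exit_int_info_err;
};

/* Step 2: VMRUN emulation loads L1's requested injection into the VMCB. */
static void sketch_vmrun(struct sketch_ctl *vmcb,
			 const struct sketch_ctl *nested)
{
	vmcb->event_inj     = nested->event_inj;
	vmcb->event_inj_err = nested->event_inj_err;
}

/* Step 4: roll back to L1 although L2 never ran. */
static void sketch_vmexit(const struct sketch_ctl *vmcb,
			  struct sketch_ctl *nested)
{
	nested->exit_int_info     = vmcb->exit_int_info;
	nested->exit_int_info_err = vmcb->exit_int_info_err;

	if (vmcb->event_inj & SKETCH_EVT_VALID) {
		/* No hardware exit happened, so both cannot be valid. */
		assert(!(vmcb->exit_int_info & SKETCH_EVT_VALID));
		nested->exit_int_info     = vmcb->event_inj;
		nested->exit_int_info_err = vmcb->event_inj_err;
	}
}

/* L1 asks to inject vector 0x20; returns what L1 sees after the rollback. */
static uint32_t sketch_rollback_demo(void)
{
	struct sketch_ctl vmcb = {0};
	struct sketch_ctl nested = { .event_inj = SKETCH_EVT_VALID | 0x20 };

	sketch_vmrun(&vmcb, &nested);   /* step 2 */
	sketch_vmexit(&vmcb, &nested);  /* steps 3 and 4 */
	return nested.exit_int_info;    /* what L1 finds in step 5 */
}
```

In this sketch sketch_rollback_demo() hands the pending event back to L1 in
exit_int_info with the valid bit still set, which is what lets L1 re-inject
it on its next VMRUN.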

> But what if that injection succeeded and we are now exiting L2 past the
> execution of VMRUN, e.g. L1 intercepts the execution of some special
> instruction in L2? Doesn't the nested exit_int_info now gain a stale
> value? Or does the hardware clear the valid bit in EVENTINJ on
> successful injection? Didn't find an indication in the spec on first
> glance.

Hardware clears event_inj. If the injection was not successful, the event
is reported in exit_int_info.
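
As a toy model of that behaviour (only a restatement of the two sentences
above, not taken from the spec text): all model_* names are hypothetical,
the valid bit is assumed to be bit 31, and clearing the whole field rather
than just the valid bit is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed valid bit for EVENTINJ/EXITINTINFO (bit 31). */
#define MODEL_EVT_VALID (1u << 31)

/* Hypothetical stand-in for the two control fields discussed. */
struct model_ctl {
	uint32_t event_inj;
	uint32_t exit_int_info;
};

/*
 * Models one real VMRUN ... #VMEXIT round trip.
 * delivered: whether the injected event was fully delivered before the exit.
 */
static void model_hw_round_trip(struct model_ctl *c, bool delivered)
{
	uint32_t pending = c->event_inj;

	c->event_inj = 0;       /* the injection request is consumed */
	c->exit_int_info = 0;
	if ((pending & MODEL_EVT_VALID) && !delivered)
		c->exit_int_info = pending; /* reported back for re-injection */

	/* Coming out of a guest, event_inj is never valid. */
	assert(!(c->event_inj & MODEL_EVT_VALID));
}

/* Returns exit_int_info after a round trip that tried to inject vector 0x20. */
static uint32_t model_exit_int_info_after(bool delivered)
{
	struct model_ctl c = { .event_inj = MODEL_EVT_VALID | 0x20 };

	model_hw_round_trip(&c, delivered);
	return c.exit_int_info;
}
```

In this model a later L2->L1 exit after a successful injection finds
event_inj clear, so the special case in nested_svm_vmexit() does not fire
and no stale value is copied.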

> Otherwise the logic seems to be like this:
>  - EVENTINJ is set to the nested value on VMRUN emulation, and only
>    there (that's in contrast to current VMX, but it makes sense)
>  - Interrupt completion with state transfer to the VCPU event queues is
>    *only* performed on L2-to-L1 exits (that's like VMX is trying to do
>    it as well)
>  - There is a special case around nested.exit_required that I didn't
>    fully get yet, nor can I say how it corresponds to logic in VMX.

Which special case do you mean? There are checks in
nested_svm_check_exception() and nested_svm_intr().

Regards,

	Joerg
