Re: [PATCH] KVM: nVMX: clear nested_run_pending when emulating invalid guest state

Radim Krčmář <rkrcmar@xxxxxxxxxx> · Fri, 9 Mar 2018 20:59:07 +0100

2018-03-05 09:39-0800, Sean Christopherson:
> Clear nested_run_pending in handle_invalid_guest_state() after calling
> emulate_instruction(), i.e. after attempting to emulate at least one
> instruction.  This fixes an issue where L0 enters an infinite loop if
> L2 hits an exception that is intercepted by L1 while L0 is emulating
> L2's invalid guest state, effectively causing DoS on L1, e.g. the only
> way to break the loop is to kill Qemu in L0.
> 
>     1. call handle_invalid_guest_state() for L2
>     2. emulate_instruction() pends an exception, e.g. #UD
>     3. L1 intercepts the exception, i.e. nested_vmx_check_exception
>        returns 1
>     4. vmx_check_nested_events() returns -EBUSY because L1 wants to
>        intercept the exception and nested_run_pending is true
>     5. handle_invalid_guest_state() never makes forward progress for
>        L2 due to the pending exception
>     6. L1 retries VMLAUNCH and VMExits to L0 indefinitely, i.e. the
>        L1 vCPU trying VMLAUNCH effectively hangs
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---

nested_run_pending signals that we still have to execute VMRESUME in
order to do the event injection requested by L2's VMCS (at least
VM_ENTRY_INTR_INFO_FIELD).

If we don't let the hardware do that injection, we need to transfer the
state from L2's VMCS ourselves while doing the nested VM exit for the
exception, i.e. behave as if we had entered the guest and exited.
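
To make the concern concrete, here is a rough userspace toy model (not
kernel code) of the quoted steps.  All names prefixed with toy_ and all
struct fields are hypothetical; entry_event_pending merely stands in for
a valid VM_ENTRY_INTR_INFO_FIELD.  The sketch assumes the emulator always
faults and that L1 intercepts the exception.  It shows that the loop
never leaves L2 without the patch, and that with the patch the nested VM
exit happens while the entry event is still sitting there unconsumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the vCPU state discussed above; not the kernel's structs. */
struct toy_vcpu {
	bool nested_run_pending;  /* L1's VMLAUNCH/VMRESUME not yet completed */
	bool entry_event_pending; /* stands in for a valid VM_ENTRY_INTR_INFO_FIELD */
	bool exception_pending;   /* e.g. #UD queued by the emulator */
	bool l1_intercepts;       /* L1 wants a VM exit for that exception */
	bool in_l2;               /* false once a nested VM exit has happened */
};

/* Models steps 3-4: refuse the nested VM exit while the entry is pending. */
static int toy_check_nested_events(struct toy_vcpu *v)
{
	if (v->exception_pending && v->l1_intercepts) {
		if (v->nested_run_pending)
			return -EBUSY;
		v->exception_pending = false;
		v->in_l2 = false; /* reflect the exception to L1 */
	}
	return 0;
}

/* Models step 2: emulation faults and queues an exception. */
static void toy_emulate_instruction(struct toy_vcpu *v)
{
	v->exception_pending = true;
}

/*
 * Models the emulation loop with a bounded budget so the toy terminates.
 * Returns the iteration at which L2 exited to L1, or -1 if it never did.
 */
static int toy_handle_invalid_guest_state(struct toy_vcpu *v, int budget,
					  bool clear_after_emulate)
{
	assert(budget > 0);
	for (int i = 1; i <= budget; i++) {
		toy_emulate_instruction(v);
		if (clear_after_emulate)
			v->nested_run_pending = false; /* the patch's change */
		toy_check_nested_events(v);
		if (!v->in_l2)
			return i;
	}
	return -1;
}
```

Calling toy_handle_invalid_guest_state() with clear_after_emulate set to
false exhausts the budget; with it set to true the exit to L1 happens on
the first iteration, but nothing in the toy has consumed
entry_event_pending, which is the state that would have to be
transferred by hand.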

And I think the actual fix here is to evaluate the interrupt before the
first emulate_instruction() in handle_invalid_guest_state().
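
One possible reading of that ordering, again as a userspace toy and not
as a proposed patch: consume the VM-entry event in software first, treat
the entry as complete, and only then start emulating.  Everything
prefixed with sketch_ is hypothetical, as are the struct fields;
entry_event_valid stands in for a valid VM_ENTRY_INTR_INFO_FIELD, and
the emulator is again assumed to fault on every attempt.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy state; field names are illustrative, not KVM's. */
struct sketch_vcpu {
	bool nested_run_pending;
	bool entry_event_valid; /* stands in for a valid VM_ENTRY_INTR_INFO_FIELD */
	int  events_delivered;  /* how many entry events reached L2 */
	bool exception_pending;
	bool l1_intercepts;
	bool in_l2;
};

/* Assumed step: consume the VM-entry event in software, completing the entry. */
static void sketch_evaluate_entry_event(struct sketch_vcpu *v)
{
	if (v->entry_event_valid) {
		v->entry_event_valid = false;
		v->events_delivered++;
	}
	v->nested_run_pending = false; /* the emulated entry is now complete */
}

/* Same -EBUSY rule as in the quoted step 4. */
static int sketch_check_nested_events(struct sketch_vcpu *v)
{
	if (v->exception_pending && v->l1_intercepts) {
		if (v->nested_run_pending)
			return -EBUSY;
		v->exception_pending = false;
		v->in_l2 = false; /* nested VM exit to L1 */
	}
	return 0;
}

/* Returns the iteration at which L2 exited to L1, or -1 if it never did. */
static int sketch_handle_invalid_guest_state(struct sketch_vcpu *v, int budget)
{
	sketch_evaluate_entry_event(v); /* before the first emulation attempt */
	for (int i = 1; i <= budget; i++) {
		v->exception_pending = true; /* emulation faults, e.g. #UD */
		if (sketch_check_nested_events(v) == 0 && !v->in_l2)
			return i;
	}
	return -1;
}
```

In this toy the entry event is delivered exactly once and the
intercepted exception reaches L1 on the first iteration, without
clearing nested_run_pending behind the injection's back.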

Do you want to look deeper into this?

Thanks.

> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 591214843046..3073160e6bae 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6835,6 +6835,8 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
>  
>  		err = emulate_instruction(vcpu, 0);
>  
> +		vmx->nested.nested_run_pending = 0;
> +
>  		if (err == EMULATE_USER_EXIT) {
>  			++vcpu->stat.mmio_exits;
>  			ret = 0;
> -- 
> 2.16.2
>