Re: [syzbot] [kvm?] WARNING in vmx_handle_exit (2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 12, 2025, James Houghton wrote:
> On Wed, Feb 12, 2025 at 2:50 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 12, 2025, James Houghton wrote:
> > > Here's what I think is going on (with the C repro anyway):
> > >
> > > 1. KVM_RUN a nested VM, and eventually we end up with
> > >    nested_run_pending=1.
> > > 2. Exit KVM_RUN with EINTR (or any reason really, but I see EINTR in
> > >    repro attempts).
> > > 3. KVM_SET_REGS to set rflags to 0x1ac585, which has X86_EFLAGS_VM,
> > >    flipping it and setting vmx->emulation_required = true.
> > > 3. KVM_RUN again. vmx->emulation_required will stop KVM from clearing
> > >    nested_run_pending, and then we hit the
> > >    KVM_BUG_ON(nested_run_pending) in __vmx_handle_exit().
> > >
> > > So I guess the KVM_BUG_ON() is a little bit too conservative, but this
> > > is nonsensical VMM behavior. So I'm not really sure what the best
> > > solution is. Sean, any thoughts?
> >
> > Heh, deja vu.  This is essentially the same thing that was fixed by commit
> > fc4fad79fc3d ("KVM: VMX: Reject KVM_RUN if emulation is required with pending
> > exception"), just with a different WARN.
> >
> > This should fix it.  Checking nested_run_pending in handle_invalid_guest_state()
> > is overkill, but it can't possibly do any harm, and the weirdness can be addressed
> > with a comment.
> 
> Thanks Sean! This works, feel free to add:
> 
> Tested-by: James Houghton <jthoughton@xxxxxxxxxx>
> 
> I understand this fix as "KVM cannot emulate a nested vm-enter, so if
> emulation is required and we have a pending vm-enter, exit to userspace."
> (This doesn't seem overkill to me... perhaps this explanation is wrong.)

Sort of.  It's a horribly convoluted scenario that's exists only because early Intel
CPUs supported a half-baked version of VMX.

Emulation is "required" if and only if guest state is invalid, and VMRESUME/VMLAUNCH
VM-Fail (architecturally) if guest state is invalid.  Thus the only way for emulation
to be required when a nested VM-Enter is pending, i.e. after nested VMRESUME/VMLAUNCH
has succeeded but before KVM has entered L2 to complete emulation, is if KVM misses a
VM-Fail consistency check, or as is the case here, if userspace stuffs invalid state
while KVM is partway through VMRESUME/VMLAUNCH emulation.

Stuffing state from userspace doesn't put KVM in harm's way, but KVM can't emulate
the impossible state, and more importantly, it trips KVM's sanity check that detects
missed consistency checks.  The KVM_BUG_ON() could also be suppressed by moving the
nested_run_pending check below the emulation_required checks (see below), but that
would largely defeat the purpose of the sanity check.

Just out of sight in the below diff is related handling for the case where userspace,
or the guest itself via modifying SMRAM before RSM, stuffs bad state.  I.e. it's
the same scenario this syzkaller program hit, minus hitting the nested_run_pending=true
window.

		/*
		 * Synthesize a triple fault if L2 state is invalid.  In normal
		 * operation, nested VM-Enter rejects any attempt to enter L2
		 * with invalid state.  However, those checks are skipped if
		 * state is being stuffed via RSM or KVM_SET_NESTED_STATE.  If
		 * L2 state is invalid, it means either L1 modified SMRAM state
		 * or userspace provided bad state.  Synthesize TRIPLE_FAULT as
		 * doing so is architecturally allowed in the RSM case, and is
		 * the least awful solution for the userspace case without
		 * risking false positives.
		 */
		if (vmx->emulation_required) {
			nested_vmx_vmexit(vcpu, EXIT_REASON_TRIPLE_FAULT, 0, 0);
			return 1;
		}

The extra wrinkle in all of this is that emulation_required is only ever set if
the vCPU lacks Unrestricted Guest (URG).  All CPUs since Westmere support URG,
while KVM does allow disabling URG via module param, AFAIK syzbot doesn't run in
environments with enable_unrestricted_guest=0 (other people do run syzkaller in
such setups, but syzbot does not).

And so the only way guest state to be invalid (for emulation_required to be set),
is if L1 is running L2 with URG disabled.  I.e. KVM _could_ simply run L2, but
doing so would violate the VMX architecture from L1's perspective.

static inline bool vmx_guest_state_valid(struct kvm_vcpu *vcpu)
{
	return is_unrestricted_guest(vcpu) || __vmx_guest_state_valid(vcpu);
}

static inline bool is_unrestricted_guest(struct kvm_vcpu *vcpu)
{
	return enable_unrestricted_guest && (!is_guest_mode(vcpu) ||
	    (secondary_exec_controls_get(to_vmx(vcpu)) &
	    SECONDARY_EXEC_UNRESTRICTED_GUEST));
}

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f72835e85b6d..42bee8f2cffb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6492,15 +6492,6 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
        if (enable_pml && !is_guest_mode(vcpu))
                vmx_flush_pml_buffer(vcpu);
 
-       /*
-        * KVM should never reach this point with a pending nested VM-Enter.
-        * More specifically, short-circuiting VM-Entry to emulate L2 due to
-        * invalid guest state should never happen as that means KVM knowingly
-        * allowed a nested VM-Enter with an invalid vmcs12.  More below.
-        */
-       if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
-               return -EIO;
-
        if (is_guest_mode(vcpu)) {
                /*
                 * PML is never enabled when running L2, bail immediately if a
@@ -6538,10 +6529,16 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
                        return 1;
                }
 
+               if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
+                       return -EIO;
+
                if (nested_vmx_reflect_vmexit(vcpu))
                        return 1;
        }
 
+       if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
+               return -EIO;
+
        /* If guest state is invalid, start emulating.  L2 is handled above. */
        if (vmx->emulation_required)
                return handle_invalid_guest_state(vcpu);






[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux