Re: KVM: x86: Racy mp_state manipulations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Il 03/03/2013 17:48, Jan Kiszka ha scritto:
> Hi all,
> 
> KVM's mp_state on x86 is usually manipulated over the context of the
> VCPU. Therefore, no locking is required. There are unfortunately two
> exceptions, and one of them is definitely broken: INIT and SIPI delivery.
> 
> The lapic may set mp_state over the context of the sending VCPU. For
> SIPI, it first checks if the mp_state is INIT_RECEIVED before updating
> it to SIPI_RECEIVED. We can only race here with user space setting the
> state in parallel, I suppose. Probably harmless in practice.

Still it would be better to add an smp_wmb/smp_rmb pair between accesses
of mp_state and sipi_vector.

Also, Io
> What is critical is the update on INIT. That signal is asynchronous to
> the target VCPU state. And we can loose it:
> 
> vcpu 1				vcpu 2
> ------				------
> hlt;
> vmexit
> 				__apic_accept_irq(APIC_DM_INIT)
> 				mp_state = KVM_MP_STATE_INIT_RECEIVED
> mp_state = KVM_MP_STATE_HALTED
> 
> And there it goes, our INIT state. I've triggered this under heavy INIT
> load and my nVMX patch for processing it while in VMXON.
> 
> I'm currently considering options to fix this:
> 
> - through a lock at mp_state manipulations, check under the lock that
>   we don't perform invalid state transitions (e.g. INIT->HLT)
> - signal the INIT via some KVM_REQ_INIT to the target VCPU, fully
>   localizing mp_state updates, the same could be done with SIPI, just
>   to play safe
> 
> I'm leaning toward the latter ATM, Any thoughts or other idea?

The latter makes sense since it's not a fast path, but the only
transition that is acceptable to KVM_MP_STATE_HALTED is from
KVM_MP_STATE_RUNNABLE:

from \ to    RUNNABLE     UNINIT      INIT     HALTED       SIPI
RUNNABLE       n/a          yes        yes       yes         NO
UNINIT         NO           n/a        yes       NO          NO
INIT           NO           yes        n/a       NO          yes
HALTED         yes          yes        yes       n/a         NO
SIPI           yes          yes        yes       NO          n/a

so for this particular bug it should also work to use a cmpxchg when
setting KVM_MP_STATE_HALTED.  Same for INIT->SIPI, since writes to
sipi_vector are harmless.

BTW, what happens when you send an INIT IPI to the bootstrap processor?
 This may be interesting if we want to emulate soft resets correctly in
QEMU; KVM makes it go to wait-for-SIPI state if I read the code
correctly, but that is wrong.

Paolo

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux