Il 03/03/2013 17:48, Jan Kiszka ha scritto: > Hi all, > > KVM's mp_state on x86 is usually manipulated over the context of the > VCPU. Therefore, no locking is required. There are unfortunately two > exceptions, and one of them is definitely broken: INIT and SIPI delivery. > > The lapic may set mp_state over the context of the sending VCPU. For > SIPI, it first checks if the mp_state is INIT_RECEIVED before updating > it to SIPI_RECEIVED. We can only race here with user space setting the > state in parallel, I suppose. Probably harmless in practice. Still it would be better to add an smp_wmb/smp_rmb pair between accesses of mp_state and sipi_vector. Also, Io > What is critical is the update on INIT. That signal is asynchronous to > the target VCPU state. And we can loose it: > > vcpu 1 vcpu 2 > ------ ------ > hlt; > vmexit > __apic_accept_irq(APIC_DM_INIT) > mp_state = KVM_MP_STATE_INIT_RECEIVED > mp_state = KVM_MP_STATE_HALTED > > And there it goes, our INIT state. I've triggered this under heavy INIT > load and my nVMX patch for processing it while in VMXON. > > I'm currently considering options to fix this: > > - through a lock at mp_state manipulations, check under the lock that > we don't perform invalid state transitions (e.g. INIT->HLT) > - signal the INIT via some KVM_REQ_INIT to the target VCPU, fully > localizing mp_state updates, the same could be done with SIPI, just > to play safe > > I'm leaning toward the latter ATM, Any thoughts or other idea? The latter makes sense since it's not a fast path, but the only transition that is acceptable to KVM_MP_STATE_HALTED is from KVM_MP_STATE_RUNNABLE: from \ to RUNNABLE UNINIT INIT HALTED SIPI RUNNABLE n/a yes yes yes NO UNINIT NO n/a yes NO NO INIT NO yes n/a NO yes HALTED yes yes yes n/a NO SIPI yes yes yes NO n/a so for this particular bug it should also work to use a cmpxchg when setting KVM_MP_STATE_HALTED. Same for INIT->SIPI, since writes to sipi_vector are harmless. BTW, what happens when you send an INIT IPI to the bootstrap processor? This may be interesting if we want to emulate soft resets correctly in QEMU; KVM makes it go to wait-for-SIPI state if I read the code correctly, but that is wrong. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html