On 2013-03-04 15:12, Paolo Bonzini wrote: > Il 03/03/2013 17:48, Jan Kiszka ha scritto: >> Hi all, >> >> KVM's mp_state on x86 is usually manipulated over the context of the >> VCPU. Therefore, no locking is required. There are unfortunately two >> exceptions, and one of them is definitely broken: INIT and SIPI delivery. >> >> The lapic may set mp_state over the context of the sending VCPU. For >> SIPI, it first checks if the mp_state is INIT_RECEIVED before updating >> it to SIPI_RECEIVED. We can only race here with user space setting the >> state in parallel, I suppose. Probably harmless in practice. > > Still it would be better to add an smp_wmb/smp_rmb pair between accesses > of mp_state and sipi_vector. Do we need a mb between sipi_vector assignment and kvm_make_request (see my patch to fix this issue)? > > Also, Io >> What is critical is the update on INIT. That signal is asynchronous to >> the target VCPU state. And we can loose it: >> >> vcpu 1 vcpu 2 >> ------ ------ >> hlt; >> vmexit >> __apic_accept_irq(APIC_DM_INIT) >> mp_state = KVM_MP_STATE_INIT_RECEIVED >> mp_state = KVM_MP_STATE_HALTED >> >> And there it goes, our INIT state. I've triggered this under heavy INIT >> load and my nVMX patch for processing it while in VMXON. >> >> I'm currently considering options to fix this: >> >> - through a lock at mp_state manipulations, check under the lock that >> we don't perform invalid state transitions (e.g. INIT->HLT) >> - signal the INIT via some KVM_REQ_INIT to the target VCPU, fully >> localizing mp_state updates, the same could be done with SIPI, just >> to play safe >> >> I'm leaning toward the latter ATM, Any thoughts or other idea? > > The latter makes sense since it's not a fast path, but the only > transition that is acceptable to KVM_MP_STATE_HALTED is from > KVM_MP_STATE_RUNNABLE: > > from \ to RUNNABLE UNINIT INIT HALTED SIPI > RUNNABLE n/a yes yes yes NO > UNINIT NO n/a yes NO NO > INIT NO yes n/a NO yes > HALTED yes yes yes n/a NO > SIPI yes yes yes NO n/a > > so for this particular bug it should also work to use a cmpxchg when > setting KVM_MP_STATE_HALTED. Same for INIT->SIPI, since writes to > sipi_vector are harmless. OK, but I already went for request bits. :) > > BTW, what happens when you send an INIT IPI to the bootstrap processor? > This may be interesting if we want to emulate soft resets correctly in > QEMU; KVM makes it go to wait-for-SIPI state if I read the code > correctly, but that is wrong. Where is this restriction specified? How do you reset the BP without resetting the while system then? Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html