Il 05/03/2013 10:37, Gleb Natapov ha scritto: > > > > Not at all. I'm keeping the state in a single place, mp_state. I just > > > > have to make sure that I do not loose asynchronous events - what INIT > > > > and SIPI are. > > > > > > As evident from this code: > > > + if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED || > > > + test_bit(KVM_REQ_INIT, &vcpu->requests)) { > > > the state is in two places. > > That's just to protect the content of sipi_vector during delivery. Is that even needed? Can we just do an unconditional: vcpu->arch.sipi_vector = vector; /* make sure sipi_vector is visible for the receiver */ smp_wmb(); kvm_make_request(KVM_REQ_SIPI, vcpu); kvm_vcpu_kick(vcpu); and check in the request handler that we did get an INIT? > > We could drop the complete if clause if we protected that variable differently. > > I understand why the code is here. I am saying that this is the evidence > that the state is in two places. I agree with Gleb here. The state is in two places. I'm not saying that using requests is wrong, it sounds nice especially for nested INIT. But there is something nasty in the patches so far. BTW checking the requests in kvm_apic_has_interrupt seems nicer than synchronizing in kvm_arch_cpu_runnable; it is a bit ugly to have side effects in kvm_arch_cpu_runnable. Then you can actually _process_ the requests only in vcpu_enter_guest and kvm_arch_vcpu_ioctl_get_mpstate(). In kvm_arch_vcpu_ioctl_put_mpstate(), KVM_MP_STATE_SIPI_RECEIVED has to become KVM_MP_STATE_INIT_RECEIVED with KVM_REQ_SIPI set. >>> > > >>>>> > >>> To overcome this we can either deprecated >>>>> > >>> KVM_MP_STATE_INIT_RECEIVED/KVM_MP_STATE_SIPI_RECEIVED values for mp_state >>>>> > >>> (use it only for migration purposes) and use separate state in APIC >>>>> > >>> to hold those event, like with nmi, or why not go with Paolo's simple >>>>> > >>> cmpxchg one? >>>> > >> >>>> > >> We need to replace most, if not all, manipulations of mp_state with >>>> > >> cmpxchg, verifying the state transitions there. Let's take again the matrix I sent yesterday, fixed to make UNINIT unreachable (that transition can only be done by userspace): from \ to RUNNABLE UNINIT INIT HALTED SIPI RUNNABLE n/a NO yes yes NO UNINIT NO n/a yes NO NO INIT NO NO yes NO yes HALTED yes NO yes n/a NO SIPI yes NO yes NO n/a We have: - anything -> INIT: this is asynchronous, and INIT always wins. No need for a cmpxchg. - RUNNABLE -> HALTED: cmpxchg needed - INIT -> SIPI: we can have a race between the last two interrupts in an INIT-INIT-SIPI (or INIT-SIPI-INIT) sequence, with these two interrupts sent by two different processors. The same race should also be present in real hardware, and should never happen (the BSP should send SIPIs to all other processors). No need for a cmpxchg. - HALTED -> RUNNABLE: racy vs. INIT too, cmpxchg needed - SIPI -> RUNNABLE: there is the same race with going back to INIT; no need for a cmpxchg. But it probably will not scale well with nested virt. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html