Il 28/05/2013 14:56, Gleb Natapov ha scritto: >> > else >> > vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED; >> > } >> > - if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) && >> > + /* >> > + * Note that we may get another INIT+SIPI sequence right here; process >> > + * the INIT first. Assumes that there are only KVM_APIC_INIT/SIPI. >> > + */ >> > + if (cmpxchg(&apic->pending_events, KVM_APIC_SIPI, 0) == KVM_APIC_SIPI && >> > vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) { > Because pending_events can be INIT/SIPI at this point and it should be > interpreted as: do SIPI and ignore INIT (atomically). My patch does "do another INIT (which will have no effect) and do SIPI after that INIT", which is different but has almost the same effect. If pending_events is INIT/SIPI, it ignores the SIPI for now and lets the next iteration of kvm_apic_accept_events do both. The difference would be that in a carefully-timed sequence of interrupts INIT-INIT-SIPI-INIT-SIPI your version would do many SIPIs, while mine would do just one. Hmm... there is a reference to this in 25.2 "Other causes of VM exits": "If a logical processor is in the wait-for-SIPI state, INIT signals are blocked. They do not cause VM exits in this case." It is not for the physical processor, but it makes sense to have the same thing. Is this the reason why you did the cmpxchg at the end? But then, there's another way to mask INITs in the wait-for-SIPI state. Considering that KVM_MP_STATE_INIT_RECEIVED is really a wait-for-SIPI, you can do: diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index e1adbb4..36bc308 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -720,8 +720,12 @@ out: break; case APIC_DM_INIT: - if (!trig_mode || level) { + if ((!trig_mode || level) && + vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) { result = 1; + + /* check mp_state before writing apic->pending_events */ + smp_mb(); /* assumes that there are only KVM_APIC_INIT/SIPI */ apic->pending_events = (1UL << KVM_APIC_INIT); /* make sure pending_events is visible before sending @@ -1865,13 +1869,17 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu) if (!kvm_vcpu_has_lapic(vcpu)) return; - if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) { + if (test_bit(KVM_APIC_INIT, &apic->pending_events)) { kvm_lapic_reset(vcpu); kvm_vcpu_reset(vcpu); if (kvm_vcpu_is_bsp(apic->vcpu)) vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; else vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED; + + /* write mp_state before toggling KVM_APIC_INIT */ + smp_mb__before_clear_bit(); + clear_bit(KVM_APIC_INIT, &apic->pending_events); } if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) && vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) { I don't have a particular preference. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html