On Fri, May 31, 2013 at 11:48:10AM +0200, Paolo Bonzini wrote:
> On 31/05/2013 11:18, Gleb Natapov wrote:
> > On Fri, May 31, 2013 at 10:48:32AM +0200, Paolo Bonzini wrote:
> >> On 31/05/2013 06:36, Gleb Natapov wrote:
> >>> In my commit message there are two INITs in a row:
> >>> vcpu0:                         vcpu1:
> >>> set INIT
> >>>                                test_and_clear_bit(KVM_APIC_INIT)
> >>>                                process INIT
> >>> set INIT
> >>> set SIPI
> >>>                                test_and_clear_bit(KVM_APIC_SIPI)
> >>>                                process SIPI
> >>>
> >>> Two INITs before SIPI are essential to trigger the bug
> >>
> >> I see now.  Let's draw pending_events as well:
> >>
> >> event sent     event processed         pending_events
> >> INIT                                   INIT
> >>                INIT                    0
> >> INIT                                   INIT
> >> SIPI                                   INIT|SIPI
> >>                SIPI                    INIT
> >>                INIT                    0
> >>
> >> Events are reordered, there is indeed a bug if the second INIT comes at
> >> just the right time.  With your patch:
> >>
> >> event sent     event processed         pending_events
> >> INIT                                   INIT
> >>                INIT                    0
> >> INIT                                   INIT
> >> SIPI                                   INIT|SIPI
> >>                SIPI, failed cmpxchg    INIT|SIPI
> >>                INIT                    SIPI
> >>                SIPI                    SIPI
> >
> > This is incorrect. cmpxchg will fail only if another INIT comes after SIPI.
> > Why would it fail?
>
> You're right.
>
> Can you show what is the case in my patch where you have coalescing?  I
You've said it in some of your emails. Quoting:

"
INIT-INIT-SIPI-INIT-SIPI your version would do many SIPIs, while mine would do
just one.
"

> still prefer it because it is a smaller change, it keeps the "clear a
> bit before processing" idea that you find almost everywhere.  Changing
> it to "clear a bit after processing" is a bigger and more surprising
> change, though both are indeed tricky.
>
There is nothing "surprising" in it for me. Really, it is so subjective
that arguing about it is a waste of everybody's time and energy. So if we
want to keep having fun arguing, let's move on to the real problems and
benefits of the patches.

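To make the reordering in the first table concrete, here is a small
single-threaded user-space model in C11. It is not kernel code: EV_INIT,
EV_SIPI, struct model, MP_RUNNABLE and replay_reordering() are names made
up for this illustration, atomic_fetch_and() stands in for
test_and_clear_bit(), and the mp_state handling is simplified. The
interleaving is replayed by hand, so this only sketches the ordering
problem and is not a real race reproducer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical stand-ins for the KVM_APIC_INIT / KVM_APIC_SIPI bits. */
enum { EV_INIT = 1u << 0, EV_SIPI = 1u << 1 };
enum mp_state { MP_RUNNABLE, MP_INIT_RECEIVED };

struct model {
	atomic_uint pending_events;
	enum mp_state mp_state;
	char log[64];		/* order in which events were processed */
};

/* Sender side: set a bit in pending_events. */
static void send_event(struct model *m, unsigned int ev)
{
	atomic_fetch_or(&m->pending_events, ev);
}

/* Models test_and_clear_bit(): clear ev, report whether it was set. */
static int test_and_clear(struct model *m, unsigned int ev)
{
	return (atomic_fetch_and(&m->pending_events, ~ev) & ev) != 0;
}

static void accept_init(struct model *m)
{
	if (test_and_clear(m, EV_INIT)) {
		m->mp_state = MP_INIT_RECEIVED;
		strcat(m->log, "INIT ");
	}
}

static void accept_sipi(struct model *m)
{
	if (test_and_clear(m, EV_SIPI) && m->mp_state == MP_INIT_RECEIVED) {
		m->mp_state = MP_RUNNABLE;
		strcat(m->log, "SIPI ");
	}
}

/* Replays the INIT, INIT, SIPI interleaving; returns the final state. */
static enum mp_state replay_reordering(struct model *m)
{
	atomic_init(&m->pending_events, 0);
	m->mp_state = MP_RUNNABLE;
	m->log[0] = '\0';

	send_event(m, EV_INIT);
	accept_init(m);			/* first INIT is processed */

	/* The sender runs between the receiver's two checks. */
	send_event(m, EV_INIT);
	send_event(m, EV_SIPI);

	accept_sipi(m);			/* SIPI overtakes the second INIT */
	assert(atomic_load(&m->pending_events) == EV_INIT);

	/* Next pass over pending_events. */
	accept_init(m);
	accept_sipi(m);

	return m->mp_state;
}
```

In this model the events are sent as INIT, INIT, SIPI but processed as
INIT, SIPI, INIT, so the modelled vcpu ends up in MP_INIT_RECEIVED with
no SIPI left to wake it.
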
What I disliked about pending_events from the start is that it
introduces two locked instructions on each interrupt injection path.
Your patch makes it worse by changing one of those locked instructions
to cmpxchg, while mine actually removes one. But I think we can do even
better: get rid of both of them in the common case, and do only one
locked instruction when there are events pending. That is the slow
path, so it is less important:

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9d75193..3e0e85a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1850,11 +1850,14 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	unsigned int sipi_vector;
+	unsigned long pe;
 
-	if (!kvm_vcpu_has_lapic(vcpu))
+	if (!kvm_vcpu_has_lapic(vcpu) || !apic->pending_events)
 		return;
 
-	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
+	pe = xchg(&apic->pending_events, 0);
+
+	if (test_bit(KVM_APIC_INIT, &pe)) {
 		kvm_lapic_reset(vcpu);
 		kvm_vcpu_reset(vcpu);
 		if (kvm_vcpu_is_bsp(apic->vcpu))
@@ -1862,7 +1865,7 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 		else
 			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
 	}
-	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
+	if (test_bit(KVM_APIC_SIPI, &pe) &&
 	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
 		/* evaluate pending_events before reading the vector */
 		smp_rmb();
--
			Gleb.
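A user-space C11 sketch of the snapshot idea in the diff above, for
illustration only. It is not kernel code: EV_INIT, EV_SIPI, struct model,
MP_RUNNABLE, take_pending() and replay_with_snapshot() are made-up names,
atomic_exchange() stands in for xchg(), and the mp_state handling is
simplified. The interleaving is replayed by hand in a single thread, so
this shows only the intended ordering and says nothing about the
sipi_vector barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical stand-ins for the KVM_APIC_INIT / KVM_APIC_SIPI bits. */
enum { EV_INIT = 1u << 0, EV_SIPI = 1u << 1 };
enum mp_state { MP_RUNNABLE, MP_INIT_RECEIVED };

struct model {
	atomic_uint pending_events;
	enum mp_state mp_state;
	char log[64];		/* order in which events were processed */
};

/* Sender side: set a bit in pending_events. */
static void send_event(struct model *m, unsigned int ev)
{
	atomic_fetch_or(&m->pending_events, ev);
}

/*
 * Common case: a plain load sees nothing pending and no read-modify-write
 * is issued. Otherwise one exchange grabs and clears all bits at once.
 */
static unsigned int take_pending(struct model *m)
{
	if (!atomic_load_explicit(&m->pending_events, memory_order_relaxed))
		return 0;
	return atomic_exchange(&m->pending_events, 0);
}

static void process_init(struct model *m, unsigned int pe)
{
	if (pe & EV_INIT) {
		m->mp_state = MP_INIT_RECEIVED;
		strcat(m->log, "INIT ");
	}
}

static void process_sipi(struct model *m, unsigned int pe)
{
	if ((pe & EV_SIPI) && m->mp_state == MP_INIT_RECEIVED) {
		m->mp_state = MP_RUNNABLE;
		strcat(m->log, "SIPI ");
	}
}

/* Replays the INIT, INIT, SIPI interleaving; returns the final state. */
static enum mp_state replay_with_snapshot(struct model *m)
{
	unsigned int pe;

	atomic_init(&m->pending_events, 0);
	m->mp_state = MP_RUNNABLE;
	m->log[0] = '\0';

	send_event(m, EV_INIT);
	pe = take_pending(m);		/* snapshot holds only INIT */
	process_init(m, pe);

	/* The sender runs in the middle of the receiver's pass. */
	send_event(m, EV_INIT);
	send_event(m, EV_SIPI);

	process_sipi(m, pe);		/* SIPI is not in this snapshot */

	/* Next pass picks up both late events together. */
	pe = take_pending(m);
	assert(pe == (EV_INIT | EV_SIPI));
	process_init(m, pe);
	process_sipi(m, pe);

	return m->mp_state;
}
```

In this model, events that arrive in the middle of a pass stay in
pending_events until the next pass, where INIT is again handled before
SIPI, so the modelled vcpu ends up runnable.
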