Re: [PATCH RFC] KVM: Fix race in apic->pending_events processing

Gleb Natapov <gleb@xxxxxxxxxx> · Thu, 30 May 2013 15:34:54 +0300



On Thu, May 30, 2013 at 09:30:41AM +0200, Paolo Bonzini wrote:
> On 30/05/2013 09:09, Gleb Natapov wrote:
> > On Thu, May 30, 2013 at 08:31:11AM +0200, Paolo Bonzini wrote:
> >> On 30/05/2013 08:01, Gleb Natapov wrote:
> >>> On Thu, May 30, 2013 at 07:41:05AM +0200, Paolo Bonzini wrote:
> >>>> On 30/05/2013 03:20, Gleb Natapov wrote:
> >>>>> On Tue, May 28, 2013 at 06:33:39PM +0200, Paolo Bonzini wrote:
> >>>>>> On 28/05/2013 17:00, Gleb Natapov wrote:
> >>>>>>> On Tue, May 28, 2013 at 03:48:58PM +0200, Paolo Bonzini wrote:
> >>>>>>>> On 28/05/2013 14:56, Gleb Natapov wrote:
> >>>>>>>>>>>  		else
> >>>>>>>>>>>  			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
> >>>>>>>>>>>  	}
> >>>>>>>>>>> -	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
> >>>>>>>>>>> +	/*
> >>>>>>>>>>> +	 * Note that we may get another INIT+SIPI sequence right here; process
> >>>>>>>>>>> +	 * the INIT first.  Assumes that there are only KVM_APIC_INIT/SIPI.
> >>>>>>>>>>> +	 */
> >>>>>>>>>>> +	if (cmpxchg(&apic->pending_events, KVM_APIC_SIPI, 0) == KVM_APIC_SIPI &&
> >>>>>>>>>>>  	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> >>>>>>>>> Because pending_events can be INIT/SIPI at this point and it should be
> >>>>>>>>> interpreted as: do SIPI and ignore INIT (atomically).
> >>>>>>>>
> >>>>>>>> My patch does "do another INIT (which will have no effect) and do SIPI 
> >>>>>>>> after that INIT", which is different but has almost the same effect.  
> >>>>>>>> If pending_events is INIT/SIPI, it ignores the SIPI for now and lets 
> >>>>>>>> the next iteration of kvm_apic_accept_events do both.  The difference 
> >>>>>>>> would be that in a carefully-timed sequence of interrupts
> >>>>>>>>
> >>>>>>> You assume that the next processing will actually happen, but this is
> >>>>>>> not necessarily the case.
> >>>>>>
> >>>>>> Why not?  The INIT and SIPI that have just been sent have kicked the
> >>>>>> VCPU again.
> >>>>>
> >>>>> A kick is a nop if the vcpu thread is neither halted nor in guest mode.
> >>>>
> >>>> But the KVM_REQ_EVENT request will be caught at:
> >>>>
> >>>>         if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
> >>>>             || need_resched() || signal_pending(current)) {
> >>>>                 vcpu->mode = OUTSIDE_GUEST_MODE;
> >>>>                 smp_wmb();
> >>>>                 local_irq_enable();
> >>>>                 preempt_enable();
> >>>>                 r = 1;
> >>>>                 goto cancel_injection;
> >>>>         }
> >>>>
> >>>> and the entry will be canceled.
> >>
> >> I was wrong: we exit immediately because state is
> >> KVM_MP_STATE_INIT_RECEIVED.  But then...
> >>
> >>> But vcpu may be in non running state so we will not get here.
> >>
> >> ... vcpu_enter_guest will return 1 and __vcpu_run goes around the while
> >> loop once more (modulo pending signals of course).
> >>
> >> On the next iteration __vcpu_run will call kvm_vcpu_block, which calls
> >> kvm_arch_vcpu_runnable.  kvm_arch_vcpu_runnable returns true because
> >> kvm_apic_has_events(vcpu) is also true.  This will set KVM_REQ_UNHALT,
> >> call kvm_apic_accept_events again and do the INIT+SIPI.
> >
> > Ah, we check kvm_apic_has_events() in runnable. Then yes, we will not
> > lose the event.
> 
> Ok, then I'd prefer to have the cmpxchg directly in the if, as in
> http://article.gmane.org/gmane.comp.emulators.kvm.devel/110505
> 
I still do not. Both of them are tricky; mine does not coalesce events
needlessly.

> Thanks for the discussion!
> 
> Paolo

--
			Gleb.