Re: [PATCH RFC] KVM: Fix race in apic->pending_events processing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 28, 2013 at 03:48:58PM +0200, Paolo Bonzini wrote:
> Il 28/05/2013 14:56, Gleb Natapov ha scritto:
> >> >  		else
> >> >  			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
> >> >  	}
> >> > -	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
> >> > +	/*
> >> > +	 * Note that we may get another INIT+SIPI sequence right here; process
> >> > +	 * the INIT first.  Assumes that there are only KVM_APIC_INIT/SIPI.
> >> > +	 */
> >> > +	if (cmpxchg(&apic->pending_events, KVM_APIC_SIPI, 0) == KVM_APIC_SIPI &&
> >> >  	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> > Because pending_events can be INIT/SIPI at this point and it should be
> > interpreted as: do SIPI and ignore INIT (atomically).
> 
> My patch does "do another INIT (which will have no effect) and do SIPI 
> after that INIT", which is different but has almost the same effect.  
> If pending_events is INIT/SIPI, it ignores the SIPI for now and lets 
> the next iteration of kvm_apic_accept_events do both.  The difference 
> would be that in a carefully-timed sequence of interrupts
> 
You assume that the next processing will actually happen, but this is
not necessary the case.

>     INIT-INIT-SIPI-INIT-SIPI
> 
> your version would do many SIPIs, while mine would do just one.
> 
> Hmm... there is a reference to this in 25.2 "Other causes of VM exits": 
> "If a logical processor is in the wait-for-SIPI state, INIT signals are 
> blocked.  They do not cause VM exits in this case."  It is not for the 
> physical processor, but it makes sense to have the same thing.  Is this 
> the reason why you did the cmpxchg at the end?
> 
No, but if helps us model proper behaviour for nested guest +1 to it :)
But until we handle INIT in nested virt it is not something that
dictates the solution.

> But then, there's another way to mask INITs in the wait-for-SIPI 
> state.  Considering that KVM_MP_STATE_INIT_RECEIVED is really a 
> wait-for-SIPI, you can do:
> 
Haven't checked it for races (especially races between multiple CPUS
sending INIT), but looks more complicated to me.

> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index e1adbb4..36bc308 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -720,8 +720,12 @@ out:
>  		break;
>  
>  	case APIC_DM_INIT:
> -		if (!trig_mode || level) {
> +		if ((!trig_mode || level) &&
> +		    vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) {
>  			result = 1;
> +
> +			/* check mp_state before writing apic->pending_events */
> +			smp_mb();
>  			/* assumes that there are only KVM_APIC_INIT/SIPI */
>  			apic->pending_events = (1UL << KVM_APIC_INIT);
>  			/* make sure pending_events is visible before sending
> @@ -1865,13 +1869,17 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
>  	if (!kvm_vcpu_has_lapic(vcpu))
>  		return;
>  
> -	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
> +	if (test_bit(KVM_APIC_INIT, &apic->pending_events)) {
>  		kvm_lapic_reset(vcpu);
>  		kvm_vcpu_reset(vcpu);
>  		if (kvm_vcpu_is_bsp(apic->vcpu))
>  			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>  		else
>  			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
> +
> +		/* write mp_state before toggling KVM_APIC_INIT */
> +		smp_mb__before_clear_bit();
> +		clear_bit(KVM_APIC_INIT, &apic->pending_events);
>  	}
>  	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
>  	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> 
> I don't have a particular preference.
> 
> Paolo

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux