Re: [PATCH 6/8 v2] Move IO APIC to its own lock.

Gleb Natapov <gleb@xxxxxxxxxx> · Wed, 12 Aug 2009 12:47:08 +0300



On Wed, Aug 12, 2009 at 12:18:10PM +0300, Avi Kivity wrote:
> On 08/12/2009 12:04 PM, Gleb Natapov wrote:
>> On Wed, Aug 12, 2009 at 11:27:13AM +0300, Avi Kivity wrote:
>>    
>>> On 08/11/2009 03:31 PM, Gleb Natapov wrote:
>>>
>>>
>>> What is the motivation for this change?
>>>
>>>      
>> The motivation was explained in 0/0. I want to get rid of lock on
>> general irq injection path so the lock have to be pushed into ioapic
>> since multiple cpus can access it concurrently. PIC has such lock
>> already.
>>    
>
> Ah, the real motivation is msi.  Pushing locks down doesn't help if we  
> keep locking them.  But for msi we avoid the lock entirely.
>
Yes. MSI is one. Multiple IOAPICs may inject interrupt in parallel too
(if we will choose to implement multiple IOAPICs sometime).

>>> Why a spinlock and not a mutex?
>>>
>>>      
>> Protected sections are small and we do not sleep there.
>>    
>
> So what?  A mutex is better since it allows preemption (and still has  
> spinlock performance if it isn't preempted).
>
This lock will be taken during irq injection from irqfd, so may be leave
it as spinlock and take it _irqsave()? Do we want to allow irq injection
from interrupt context? Otherwise if you say that performance is the
same I don't care one way or the other.

>
>
>>> Need to explain why this is safe.  I'm not sure it is, because we touch
>>> state afterwards in pic_intack().  We need to do all vcpu-synchronous
>>> operations before dropping the lock.
>>>      
>> Forst pic_intack() calls pic_clear_isr() only in auto eoi mode and this mode
>> is already broken for assigned devices. Second for level triggered
>> interrupts pic_intack() does nothing after calling pic_clear_isr() and
>> third I can move pic_clear_isr() call to the end of pic_intack().
>>    
>
> I meant, in a comment.
I you agree with above I'll add it as a comment.

>
>>>>     void kvm_pic_clear_isr_ack(struct kvm *kvm)
>>>> @@ -238,7 +240,9 @@ void kvm_pic_reset(struct kvm_kpic_state *s)
>>>>    		if (vcpu0&&   kvm_apic_accept_pic_intr(vcpu0))
>>>>    			if (s->irr&   (1<<   irq) || s->isr&   (1<<   irq)) {
>>>>    				n = irq + irqbase;
>>>> +				spin_unlock(&s->pics_state->lock);
>>>>    				kvm_notify_acked_irq(kvm, SELECT_PIC(n), n);
>>>> +				spin_lock(&s->pics_state->lock);
>>>>
>>>>        
>>> Ditto here, needs to be moved until after done changing state.
>>>
>>>      
>> I am not sure this code is even needed. IOAPIC don't call notifiers on
>> reset.
>>    
>
> It should.  What if there's a reset with an assigned device?  We need to  
> release the device interrupt (after doing FLR?).
Doing this will just re-enable host interrupt while irq condition is not
cleared in the device. The host will hang. So I think we really shouldn't.

>
>>    
>>>> -static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int pin,
>>>> -				    int trigger_mode)
>>>> +static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
>>>> +				     int trigger_mode)
>>>>    {
>>>> -	union kvm_ioapic_redirect_entry *ent;
>>>> +	int i;
>>>> +
>>>> +	for (i = 0; i<   IOAPIC_NUM_PINS; i++) {
>>>> +		union kvm_ioapic_redirect_entry *ent =&ioapic->redirtbl[i];
>>>> +
>>>> +		if (ent->fields.vector != vector)
>>>> +			continue;
>>>>
>>>> -	ent =&ioapic->redirtbl[pin];
>>>> +		spin_unlock(&ioapic->lock);
>>>> +		kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i);
>>>> +		spin_lock(&ioapic->lock);
>>>>
>>>>
>>>>        
>>> I *think* we need to clear remote_irr before dropping the lock.  I
>>> *know* there's a missing comment here.
>>>      
>> I don't see why we clear remote_irr before dropping the lock. If, while
>> lock was dropped, interrupt was delivered to this entry it will be
>> injected when ack notifier returns.
>>    
>
> But we'll clear remote_irr afterward the redelivery, and we should to  
> that only after the new interrupt is acked.
It depend on whether you consider calling ack notifiers a part of
interrupt acknowledgement process. If you do then remote_irr should not
be cleared before ack notifiers since ack process is not completed yet.
With current users functionally it shouldn't matter when we clear
remote_irr. I prefer doing it like we do it now since this how it was
before my patches and since code is simpler this way.

>
>>>> -	kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, pin);
>>>> +		if (trigger_mode != IOAPIC_LEVEL_TRIG)
>>>> +			continue;
>>>>
>>>> -	if (trigger_mode == IOAPIC_LEVEL_TRIG) {
>>>>    		ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
>>>>    		ent->fields.remote_irr = 0;
>>>> -		if (!ent->fields.mask&&   (ioapic->irr&   (1<<   pin)))
>>>> -			ioapic_service(ioapic, pin);
>>>> +		if (!ent->fields.mask&&   (ioapic->irr&   (1<<   i)))
>>>> +			ioapic_service(ioapic, i);
>>>>    	}
>>>>    }
>>>>
>>>>        
>>> To make the patch easier to read, suggest keeping the loop in the other
>>> function.
>>>
>>>      
>> I don't follow. All __kvm_ioapic_update_eoi() contains is the loop, so
>> the loop is already in its own function. Do you mean move the context of
>> the loop into the other function and leave only for(;;) fun(); in
>> __kvm_ioapic_update_eoi()?
>>    
>
> No, I mean keep the for loop in kvm_ioapic_update_eoi().
>
Can't do that. __kvm_ioapic_update_eoi() is called from other place with
lock held already.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html