On Wed, Aug 12, 2009 at 12:18:10PM +0300, Avi Kivity wrote: > On 08/12/2009 12:04 PM, Gleb Natapov wrote: >> On Wed, Aug 12, 2009 at 11:27:13AM +0300, Avi Kivity wrote: >> >>> On 08/11/2009 03:31 PM, Gleb Natapov wrote: >>> >>> >>> What is the motivation for this change? >>> >>> >> The motivation was explained in 0/0. I want to get rid of lock on >> general irq injection path so the lock have to be pushed into ioapic >> since multiple cpus can access it concurrently. PIC has such lock >> already. >> > > Ah, the real motivation is msi. Pushing locks down doesn't help if we > keep locking them. But for msi we avoid the lock entirely. > Yes. MSI is one. Multiple IOAPICs may inject interrupt in parallel too (if we will choose to implement multiple IOAPICs sometime). >>> Why a spinlock and not a mutex? >>> >>> >> Protected sections are small and we do not sleep there. >> > > So what? A mutex is better since it allows preemption (and still has > spinlock performance if it isn't preempted). > This lock will be taken during irq injection from irqfd, so may be leave it as spinlock and take it _irqsave()? Do we want to allow irq injection from interrupt context? Otherwise if you say that performance is the same I don't care one way or the other. > > >>> Need to explain why this is safe. I'm not sure it is, because we touch >>> state afterwards in pic_intack(). We need to do all vcpu-synchronous >>> operations before dropping the lock. >>> >> Forst pic_intack() calls pic_clear_isr() only in auto eoi mode and this mode >> is already broken for assigned devices. Second for level triggered >> interrupts pic_intack() does nothing after calling pic_clear_isr() and >> third I can move pic_clear_isr() call to the end of pic_intack(). >> > > I meant, in a comment. I you agree with above I'll add it as a comment. > >>>> void kvm_pic_clear_isr_ack(struct kvm *kvm) >>>> @@ -238,7 +240,9 @@ void kvm_pic_reset(struct kvm_kpic_state *s) >>>> if (vcpu0&& kvm_apic_accept_pic_intr(vcpu0)) >>>> if (s->irr& (1<< irq) || s->isr& (1<< irq)) { >>>> n = irq + irqbase; >>>> + spin_unlock(&s->pics_state->lock); >>>> kvm_notify_acked_irq(kvm, SELECT_PIC(n), n); >>>> + spin_lock(&s->pics_state->lock); >>>> >>>> >>> Ditto here, needs to be moved until after done changing state. >>> >>> >> I am not sure this code is even needed. IOAPIC don't call notifiers on >> reset. >> > > It should. What if there's a reset with an assigned device? We need to > release the device interrupt (after doing FLR?). Doing this will just re-enable host interrupt while irq condition is not cleared in the device. The host will hang. So I think we really shouldn't. > >> >>>> -static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int pin, >>>> - int trigger_mode) >>>> +static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector, >>>> + int trigger_mode) >>>> { >>>> - union kvm_ioapic_redirect_entry *ent; >>>> + int i; >>>> + >>>> + for (i = 0; i< IOAPIC_NUM_PINS; i++) { >>>> + union kvm_ioapic_redirect_entry *ent =&ioapic->redirtbl[i]; >>>> + >>>> + if (ent->fields.vector != vector) >>>> + continue; >>>> >>>> - ent =&ioapic->redirtbl[pin]; >>>> + spin_unlock(&ioapic->lock); >>>> + kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i); >>>> + spin_lock(&ioapic->lock); >>>> >>>> >>>> >>> I *think* we need to clear remote_irr before dropping the lock. I >>> *know* there's a missing comment here. >>> >> I don't see why we clear remote_irr before dropping the lock. If, while >> lock was dropped, interrupt was delivered to this entry it will be >> injected when ack notifier returns. >> > > But we'll clear remote_irr afterward the redelivery, and we should to > that only after the new interrupt is acked. It depend on whether you consider calling ack notifiers a part of interrupt acknowledgement process. If you do then remote_irr should not be cleared before ack notifiers since ack process is not completed yet. With current users functionally it shouldn't matter when we clear remote_irr. I prefer doing it like we do it now since this how it was before my patches and since code is simpler this way. > >>>> - kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, pin); >>>> + if (trigger_mode != IOAPIC_LEVEL_TRIG) >>>> + continue; >>>> >>>> - if (trigger_mode == IOAPIC_LEVEL_TRIG) { >>>> ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG); >>>> ent->fields.remote_irr = 0; >>>> - if (!ent->fields.mask&& (ioapic->irr& (1<< pin))) >>>> - ioapic_service(ioapic, pin); >>>> + if (!ent->fields.mask&& (ioapic->irr& (1<< i))) >>>> + ioapic_service(ioapic, i); >>>> } >>>> } >>>> >>>> >>> To make the patch easier to read, suggest keeping the loop in the other >>> function. >>> >>> >> I don't follow. All __kvm_ioapic_update_eoi() contains is the loop, so >> the loop is already in its own function. Do you mean move the context of >> the loop into the other function and leave only for(;;) fun(); in >> __kvm_ioapic_update_eoi()? >> > > No, I mean keep the for loop in kvm_ioapic_update_eoi(). > Can't do that. __kvm_ioapic_update_eoi() is called from other place with lock held already. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html