Am 02.11.2010 19:52, Michael S. Tsirkin wrote:
> On Tue, Nov 02, 2010 at 07:40:46PM +0100, Jan Kiszka wrote:
>> Am 02.11.2010 19:24, Michael S. Tsirkin wrote:
>>> On Tue, Nov 02, 2010 at 06:56:14PM +0100, Jan Kiszka wrote:
>>>>>>  		dev->host_irq_disabled = false;
>>>>>>  	}
>>>>>> -	spin_unlock(&dev->intx_lock);
>>>>>> +out:
>>>>>> +	spin_unlock_irq(&dev->intx_lock);
>>>>>> +
>>>>>> +	if (reassert)
>>>>>> +		kvm_set_irq(dev->kvm, dev->irq_source_id, dev->guest_irq, 1);
>>>>>
>>>>> Hmm, I think this still has more overhead than it needs to have.
>>>>> Instead of setting the level to 0 and then back to 1, can't we just
>>>>> avoid the set to 1 in the first place? This would need a different
>>>>> interface than pci_2_3_irq_check_and_unmask to avoid a race
>>>>> where an interrupt is received while we are acking another one:
>>>>>
>>>>> 	block userspace access
>>>>> 	check pending bit
>>>>> 	if (!pending)
>>>>> 		set irq (0)
>>>>> 	clear pending
>>>>> 	unblock userspace access
>>>>>
>>>>> Would be worth it for high-volume devices.
>>>>
>>>> The problem is that we can't reorder guest IRQ line clearing and host
>>>> IRQ line enabling without taking a lock across host IRQ disable + guest
>>>> IRQ raise - and that is now distributed across hard and threaded IRQ
>>>> handlers, and we don't want to hold an IRQ-safe lock during kvm_set_irq.
>>>
>>> Oh, I think I confused you.
>>> What I mean is:
>>>
>>> 	block userspace access
>>> 	check interrupt status bit
>>> 	if (!interrupt status bit set)
>>> 		set irq (0)
>>> 	clear interrupt disable bit
>>> 	unblock userspace access
>>>
>>> This way we enable the interrupt after the set irq, so there is no need
>>> for extra locks, I think.
>>
>> OK. That would require some serious refactoring again.
>
> That would also mean we can't just solve the nested block/unblock
> problem with a simple lock. Not sure this is worth the effort.

Can't follow: what can be nested, and how?

>
>> But what about edge IRQs? Don't we need to toggle the bit for them? And
>> as we do not differentiate between level and edge, we currently have to
>> do this unconditionally.
>
> AFAIK PCI IRQs are level, so I don't think we need to bother.

Ah, indeed.

>
>>>
>>> Hmm, one thing I noticed is that pci_block_user_cfg_access
>>> will BUG_ON if it was already blocked. So I think we have
>>> a bug here when the interrupt handler kicks in right after
>>> we unmask interrupts.
>>>
>>> Probably need some kind of lock to protect against this.
>>
>> Or an atomic counter.
>
> BTW, block userspace access uses a global spinlock which will likely hurt
> us on multi-CPU. Switching that to something more SMP friendly, e.g. a
> per-device spinlock, might be a good idea: I don't see why that lock and
> queue are global.

I've been through that code recently - hairy stuff. pci_lock also protects
the bus operations, which can be overloaded (e.g. for error injection, if
that is not their only use case). So we would need a per-bus lock, but that
could still cause contention if devices on the same bus are handled on
different CPUs. I think the whole PCI config interface is simply not
designed for performance. It's considered a slow path, which it normally is.

>
>> Will have a look.
>
> Need to also consider an interrupt running in parallel
> with unmasking in the thread.
>
>> Alex, does VFIO take care of this already?
>
> VFIO does this all under spin_lock_irq.

Everything has its pros and cons...
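For illustration, below is a minimal sketch of the ordering Michael proposes
above for the PCI 2.3 INTx path: while INTx is still disabled in the command
register, deassert the guest line if the device is not signalling, and only
then re-enable INTx, so no extra lock around kvm_set_irq should be needed.
This is not the actual in-tree code: the function name is made up, the
dev->dev/guest_irq/host_irq_disabled fields just follow the quoted patch, it
uses the pci_block_user_cfg_access() interface of that kernel generation, and
it deliberately leaves out the serialization against the hard IRQ handler and
the nested block/unblock issue discussed above.

/* Hedged sketch only - ordering per the discussion, locking omitted. */
#include <linux/pci.h>
#include <linux/kvm_host.h>

static void assigned_dev_ack_intx_sketch(struct kvm_assigned_dev_kernel *dev)
{
	struct pci_dev *pdev = dev->dev;
	u16 status, cmd;

	/* Keep userspace config space accesses out of the way. */
	pci_block_user_cfg_access(pdev);

	pci_read_config_word(pdev, PCI_STATUS, &status);
	if (!(status & PCI_STATUS_INTERRUPT))
		/* Device is quiet: lower the guest line first ... */
		kvm_set_irq(dev->kvm, dev->irq_source_id, dev->guest_irq, 0);

	/* ... and only then clear the interrupt disable bit again. */
	pci_read_config_word(pdev, PCI_COMMAND, &cmd);
	pci_write_config_word(pdev, PCI_COMMAND,
			      cmd & ~PCI_COMMAND_INTX_DISABLE);
	dev->host_irq_disabled = false;

	pci_unblock_user_cfg_access(pdev);
}

If the status bit is still set, the guest line stays asserted and re-enabling
INTx simply lets the host handler mask the device again, so the unconditional
clear of the disable bit matches the pseudocode above.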
Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux