Re: [PATCH v2 4/4] KVM: Allow host IRQ sharing for passed-through PCI 2.3 devices

"Michael S. Tsirkin" <mst@xxxxxxxxxx> · Tue, 2 Nov 2010 21:14:30 +0200

On Tue, Nov 02, 2010 at 08:11:31PM +0100, Jan Kiszka wrote:
> Am 02.11.2010 19:52, Michael S. Tsirkin wrote:
> > On Tue, Nov 02, 2010 at 07:40:46PM +0100, Jan Kiszka wrote:
> >> Am 02.11.2010 19:24, Michael S. Tsirkin wrote:
> >>> On Tue, Nov 02, 2010 at 06:56:14PM +0100, Jan Kiszka wrote:
> >>>>>>  		dev->host_irq_disabled = false;
> >>>>>>  	}
> >>>>>> -	spin_unlock(&dev->intx_lock);
> >>>>>> +out:
> >>>>>> +	spin_unlock_irq(&dev->intx_lock);
> >>>>>> +
> >>>>>> +	if (reassert)
> >>>>>> +		kvm_set_irq(dev->kvm, dev->irq_source_id, dev->guest_irq, 1);
> >>>>>
> >>>>> Hmm, I think this still has more overhead than it needs to have.
> >>>>> Instead of setting level to 0 and then back to 1, can't we just
> >>>>> avoid set to 1 in the first place? This would need a different
> >>>>> interface than pci_2_3_irq_check_and_unmask to avoid a race
> >>>>> where interrupt is received while we are acking another one:
> >>>>>
> >>>>> 	block userspace access
> >>>>> 	check pending bit
> >>>>> 	if (!pending)
> >>>>> 		set irq (0)
> >>>>> 	clear pending
> >>>>> 	block userspace access
> >>>>>
> >>>>> Would be worth it for high volume devices.
> >>>>
> >>>> The problem is that we can't reorder guest IRQ line clearing and host
> >>>> IRQ line enabling without taking a lock across host IRQ disable + guest
> >>>> IRQ raise - and that is now distributed across hard and threaded IRQ
> >>>> handlers and we don't want to hold and IRQ-safe lock during kvm_set_irq.
> >>>
> >>> Oh I think I confused you.
> >>> What I mean is:
> >>>
> >>>  	block userspace access
> >>>  	check interrupt status bit
> >>>  	if (!interrupt status bit set)
> >>>  		set irq (0)
> >>>  	clear interrupt disable bit
> >>>  	block userspace access
> >>>
> >>> This way we enable interrupt after set irq so not need for
> >>> extra locks I think.
> >>
> >> OK. Would require some serious refactoring again.
> > 
> > That would also mean we can't just solve the nested block/unblock
> > problem with a simple lock.  Not sure this is worth the effort.
> 
> Can't follow: what can be nested and how?

I just mean interrupt trying to block userspace when thread
did that already.

> > 
> >> But what about edge IRQs? Don't we need to toggle the bit for them? And
> >> as we do not differentiate between level and edge, we currently have to
> >> do this unconditionally.
> > 
> > AFAIK PCI IRQs are level, so I don't think we need to bother.
> 
> Ah, indeed.
> 
> > 
> >>>
> >>> Hmm one thing I noticed is that pci_block_user_cfg_access
> >>> will BUG_ON if it was already blocked. So I think we have
> >>> a bug here when interrupt handler kicks in right after
> >>> we unmask interrupts.
> >>>
> >>> Probably need some kind of lock to protect against this.
> >>>
> >>
> >> Or an atomic counter.
> > 
> > BTW block userspace access uses a global spinlock which will likely hurt
> > us on multi-CPU. Switching that to something more SMP friendly, e.g. a
> > per-device spinlock, might be a good idea: I don't see why that lock and
> > queue are global.
> 
> Been through that code recently, hairy stuff. pci_lock also protects the
> bus operation which can be overloaded (e.g. for error injection - if
> that is not the only use case). So we need a per-bus lock, but that can
> still cause contentions if devices on the same bus are handled on
> different cpus.

Right. So this lock got reused for blocking userspace, I do not suggest
we rip it all out, just make userspace blocking use
a finer-grained lock.

> I think the whole PCI config interface is simply not designed for
> performance. It's considered a slow path, which it normally is.

So I guess we'll need to fix that now, at least if we
want to make the 2.3 way the default.

> > 
> >> Will have a look.
> > 
> > Need to also consider an interrupt running in parallel
> > with unmasking in thread.
> > 
> >> Alex, does VFIO take care of this already?
> > 
> > VFIO does this all under spin_lock_irq.
> 
> Everything has its pros and cons...
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html