Re: [RFC][PATCH 28/45] qemu-kvm: msix: Drop tracking of used vectors

"Michael S. Tsirkin" <mst@xxxxxxxxxx> · Tue, 18 Oct 2011 17:56:38 +0200

On Tue, Oct 18, 2011 at 05:22:38PM +0200, Jan Kiszka wrote:
> On 2011-10-18 17:08, Michael S. Tsirkin wrote:
> > On Tue, Oct 18, 2011 at 04:08:46PM +0200, Jan Kiszka wrote:
> >> On 2011-10-18 16:01, Michael S. Tsirkin wrote:
> >>>>>>>>>
> >>>>>>>>> I actually would not mind preallocating everything upfront which is much
> >>>>>>>>> easier.  But with your patch we get a silent failure or a drastic
> >>>>>>>>> slowdown which is much more painful IMO.
> >>>>>>>>
> >>>>>>>> Again: did we already see that limit? And where does it come from if not
> >>>>>>>> from KVM?
> >>>>>>>
> >>>>>>> It's a hardware limitation of Intel APICs. The interrupt vector is encoded
> >>>>>>> in an 8-bit field of the MSI data word. So you can have at most 256 of these.
> >>>>>>
> >>>>>> There should be no such limitation with pseudo GSIs we use for MSI
> >>>>>> injection. They end up as MSI messages again, so actually 256 (-reserved
> >>>>>> vectors) * number-of-cpus (on x86).
> >>>>>
> >>>>> This limits which CPUs can get the interrupt though.
> >>>>> Linux seems to have a global pool as it wants to be able to freely
> >>>>> balance vectors between CPUs. Or, consider a guest with a single CPU :)
> >>>>>
> >>>>> Anyway, why argue - there is a limitation, and it's not coming from KVM,
> >>>>> right?
> >>>>
> >>>> No, the limit we hit with MSI message routing is first of all KVM GSIs,
> >>>> and there only pseudo GSIs that do not go to any interrupt controller
> >>>> with limited pins.
> >>>
> >>> I see KVM_MAX_IRQ_ROUTES 1024
> >>> This is > 256 so KVM does not seem to be the problem.
> >>
> >> We can generate way more different MSI messages than 256. A message may
> >> encode the target CPU, so you have this number in the equation e.g.
> > 
> > Yes but the vector is encoded in 8 bits (256 values). The rest is
> > stuff like delivery mode, which won't affect which
> > handler is run AFAIK. So while the problem might
> > appear with vector sharing, in practice there is
> > no vector sharing so no problem :)
> > 
> >>>
> >>>> That could easily be lifted in the kernel if we run
> >>>> into shortages in practice.
> >>>
> >>> What I was saying is that resources are limited even without kvm.
> >>
> >> What other resources related to this particular case are exhausted
> >> before GSI numbers?
> >>
> >> Jan
> > 
> > distinct vectors
> 
> The guest is responsible for managing vectors, not KVM, not QEMU. And
> the guest will notice first when it runs out of them. So a virtio guest
> driver may not even request MSI-X support if that happens.

Absolutely. You can solve the problem from the guest in theory.
But what I was saying is that in practice the first X devices
get MSI-X and the others don't. Guests aren't doing anything smart as
they are not designed with a huge number of devices in mind.

What we can do is solve the problem from management.
And to do that we can't delay allocation until a vector is first used.
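
A rough sketch of what up-front allocation buys management. All names
here are hypothetical and this is not QEMU or KVM code; the pool size is
simply the KVM_MAX_IRQ_ROUTES value quoted earlier in this thread.
Reserving every route a device needs at setup time makes a shortage
visible as an error right away, rather than at first use:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed pool size: the KVM_MAX_IRQ_ROUTES value quoted in this thread. */
#define ROUTE_POOL_SIZE 1024

/* Hypothetical bookkeeping for a fixed pool of MSI routes. */
struct route_pool {
    size_t used;    /* number of routes currently reserved */
};

/*
 * Reserve n routes in one step, e.g. when a device is added.
 * Returns false and changes nothing if the pool cannot hold them,
 * so the caller learns about the shortage at setup time.
 */
static bool route_pool_reserve(struct route_pool *p, size_t n)
{
    if (n > ROUTE_POOL_SIZE - p->used)
        return false;
    p->used += n;
    return true;
}

/* Give n routes back, e.g. when the device is removed. */
static void route_pool_release(struct route_pool *p, size_t n)
{
    assert(n <= p->used);
    p->used -= n;
}
```

With lazy allocation the equivalent failure would only show up when the
guest first uses a vector, where there is nobody left to report it to.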

> What KVM has to do is just map an arbitrary MSI message
> (theoretically 64+32 bits, in practice it's of course much less) to
> a single GSI and vice versa. As there are fewer GSIs than possible MSI
> messages, we could run out of them when creating routes, statically or
> lazily.

Possible MSI messages != possible MSI vectors.
If two devices share a vector, the APIC won't be able
to distinguish them even though e.g. the delivery mode is
different.
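
To illustrate, a minimal sketch assuming the standard x86 MSI layout:
vector in data bits 7:0, delivery mode in data bits 10:8, destination
APIC ID in address bits 19:12. The struct and helper names are made up
for illustration and are not QEMU or KVM identifiers. The comparison
also assumes physical destination mode, which is a simplification:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for one MSI message (not a QEMU/KVM type). */
struct msi_msg {
    uint32_t addr_lo;   /* lower 32 bits of the message address */
    uint32_t data;      /* message data word */
};

/* Vector: data bits 7:0, hence at most 256 distinct values. */
static inline uint8_t msi_vector(struct msi_msg m)
{
    return (uint8_t)(m.data & 0xff);
}

/* Delivery mode: data bits 10:8 (0 = fixed, 1 = lowest priority, ...). */
static inline uint8_t msi_delivery_mode(struct msi_msg m)
{
    return (uint8_t)((m.data >> 8) & 0x7);
}

/* Destination APIC ID: address bits 19:12. */
static inline uint8_t msi_dest_id(struct msi_msg m)
{
    /* x86 MSI addresses live in the 0xFEExxxxx window. */
    assert((m.addr_lo >> 20) == 0xfee);
    return (uint8_t)((m.addr_lo >> 12) & 0xff);
}

/*
 * Two messages land on the same handler slot when destination and
 * vector match, whatever the delivery mode says.
 */
static inline bool msi_same_handler(struct msi_msg a, struct msi_msg b)
{
    return msi_dest_id(a) == msi_dest_id(b) &&
           msi_vector(a) == msi_vector(b);
}
```

So messages 0xfee01000/0x0041 and 0xfee01000/0x0141 are distinct as
messages (delivery mode 0 vs 1) but compare equal here, since both name
vector 0x41 on APIC ID 1.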

> What would probably help us long-term out of your concerns regarding
> lazy routing is to bypass that redundant GSI translation for dynamic
> messages, i.e. those that are not associated with an irqfd number or an
> assigned device irq. Something like a KVM_DELIVER_MSI IOCTL that accepts
> address and data directly.
> 
> Jan

You are trying to work around the problem by not requiring
any resources per MSI vector. This just might work for some
uses (ioctl) but isn't a generic solution (e.g. won't work for irqfd).

> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux