On Tue, Oct 18, 2011 at 05:22:38PM +0200, Jan Kiszka wrote:
> On 2011-10-18 17:08, Michael S. Tsirkin wrote:
> > On Tue, Oct 18, 2011 at 04:08:46PM +0200, Jan Kiszka wrote:
> >> On 2011-10-18 16:01, Michael S. Tsirkin wrote:
> >>>>>>>>>
> >>>>>>>>> I actually would not mind preallocating everything upfront which is much
> >>>>>>>>> easier. But with your patch we get a silent failure or a drastic
> >>>>>>>>> slowdown which is much more painful IMO.
> >>>>>>>>
> >>>>>>>> Again: did we already saw that limit? And where does it come from if not
> >>>>>>>> from KVM?
> >>>>>>>
> >>>>>>> It's a hardware limitation of intel APICs. interrupt vector is encoded
> >>>>>>> in an 8 bit field in msi address. So you can have at most 256 of these.
> >>>>>>
> >>>>>> There should be no such limitation with pseudo GSIs we use for MSI
> >>>>>> injection. They end up as MSI messages again, so actually 256 (-reserved
> >>>>>> vectors) * number-of-cpus (on x86).
> >>>>>
> >>>>> This limits which CPUs can get the interrupt though.
> >>>>> Linux seems to have a global pool as it wants to be able to freely
> >>>>> balance vectors between CPUs. Or, consider a guest with a single CPU :)
> >>>>>
> >>>>> Anyway, why argue - there is a limitation, and it's not coming from KVM,
> >>>>> right?
> >>>>
> >>>> No, our limit we hit with MSI message routing are first of all KVM GSIs,
> >>>> and there only pseudo GSIs that do not go to any interrupt controller
> >>>> with limited pins.
> >>>
> >>> I see KVM_MAX_IRQ_ROUTES 1024
> >>> This is > 256 so KVM does not seem to be the problem.
> >>
> >> We can generate way more different MSI messages than 256. A message may
> >> encode the target CPU, so you have this number in the equation e.g.
> >
> > Yes but the vector is encoded in 256 bits. The rest is
> > stuff like delivery mode, which won't affect which
> > handler is run AFAIK.
> > So while the problem might
> > appear with vector sharing, in practice there is
> > no vector sharing so no problem :)
> >
> >>>
> >>>> That could easily be lifted in the kernel if we run
> >>>> into shortages in practice.
> >>>
> >>> What I was saying is that resources are limited even without kvm.
> >>
> >> What other resources related to this particular case are exhausted
> >> before GSI numbers?
> >>
> >> Jan
> >
> > distinct vectors
>
> The guest is responsible for managing vectors, not KVM, not QEMU. And
> the guest will notice first when it runs out of them. So a virtio guest
> driver may not even request MSI-X support if that happens.

Absolutely. In theory you can solve the problem from the guest.
But what I was saying is that, in practice, the first X devices
get MSI-X and the others don't. Guests aren't doing anything smart,
as they are not designed with a huge number of devices in mind.
What we can do is solve the problem from management.
And to do that, we can't delay allocation until it's used.

> What KVM has to do is just mapping an arbitrary MSI message
> (theoretically 64+32 bits, in practice it's much of course much less) to
> a single GSI and vice versa. As there are less GSIs than possible MSI
> messages, we could run out of them when creating routes, statically or
> lazily.

Possible MSI messages != possible MSI vectors.
If two devices share a vector, the APIC won't be able to distinguish
them, even though e.g. the delivery mode is different.

> What would probably help us long-term out of your concerns regarding
> lazy routing is to bypass that redundant GSI translation for dynamic
> messages, i.e. those that are not associated with an irqfd number or an
> assigned device irq. Something like a KVM_DELIVER_MSI IOCTL that accepts
> address and data directly.
>
> Jan

You are trying to work around the problem by not requiring
any resources per MSI vector. This might work for some
uses (ioctl) but isn't a generic solution (e.g. it won't work for irqfd).
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
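
To make the "messages != vectors" point concrete, here is a rough sketch, not taken from any patch in this thread. It assumes the usual x86 xAPIC MSI layout as I understand it: the vector in bits 7:0 of the data word, the delivery mode in bits 10:8 of the data word, and the destination APIC ID in bits 19:12 of the address. All function and constant names are made up for illustration.

```python
# Illustrative only: field positions assume the common x86 xAPIC MSI format;
# every name here is hypothetical and not part of KVM or QEMU.
MSI_ADDR_BASE = 0xFEE00000     # fixed upper bits of an x86 MSI address
DEST_ID_SHIFT = 12             # address bits 19:12: destination APIC ID
DEST_ID_MASK = 0xFF
VECTOR_MASK = 0xFF             # data bits 7:0: interrupt vector
DELIVERY_MODE_SHIFT = 8        # data bits 10:8: delivery mode
DELIVERY_MODE_MASK = 0x7


def make_msi(dest_id, vector, delivery_mode=0):
    """Build an (address, data) pair for one MSI message."""
    address = MSI_ADDR_BASE | ((dest_id & DEST_ID_MASK) << DEST_ID_SHIFT)
    data = ((delivery_mode & DELIVERY_MODE_MASK) << DELIVERY_MODE_SHIFT) \
        | (vector & VECTOR_MASK)
    return address, data


def msi_dest_id(address):
    """Extract the destination APIC ID from an MSI address."""
    return (address >> DEST_ID_SHIFT) & DEST_ID_MASK


def msi_vector(data):
    """Extract the 8-bit vector from an MSI data word."""
    return data & VECTOR_MASK


def count_distinct(messages):
    """Return (distinct messages, distinct (cpu, vector) pairs, distinct vectors)."""
    msgs = set(messages)
    pairs = {(msi_dest_id(a), msi_vector(d)) for a, d in msgs}
    vectors = {msi_vector(d) for _, d in msgs}
    return len(msgs), len(pairs), len(vectors)


if __name__ == "__main__":
    fixed = make_msi(dest_id=0, vector=0x41, delivery_mode=0)
    lowest = make_msi(dest_id=0, vector=0x41, delivery_mode=1)
    other_cpu = make_msi(dest_id=1, vector=0x41, delivery_mode=0)
    # Three different messages (so three routes), but only one vector value.
    print(count_distinct([fixed, lowest, other_cpu]))
```

The first count is what a per-message routing table has to hold, the second is what Jan's "256 * number-of-cpus" bound counts, and the third is what a guest with a single global vector pool actually has to hand out.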