On Tue, Oct 18, 2011 at 03:46:06PM +0200, Jan Kiszka wrote:
> On 2011-10-18 15:37, Michael S. Tsirkin wrote:
> > On Tue, Oct 18, 2011 at 03:00:29PM +0200, Jan Kiszka wrote:
> >> On 2011-10-18 14:48, Michael S. Tsirkin wrote:
> >>>> To my understanding, virtio will be the exception, as no other device
> >>>> will have a chance to react on resource shortage while sending(!) an
> >>>> MSI message.
> >>>
> >>> Hmm, are you familiar with that spec?
> >>
> >> Not by heart.
> >>
> >>> This is not what virtio does;
> >>> resource shortage is detected during setup.
> >>> This is exactly the problem with lazy registration: you don't
> >>> allocate until it's too late.
> >>
> >> When is that setup phase? Does it actually come after every change to
> >> an MSI vector? I doubt so.
> >
> > No. During setup, the driver requests vectors from the OS, and then
> > tells the device which vector each VQ should use. It then checks that
> > the assignment was successful. If not, it retries with fewer vectors.
> >
> > Other devices can do this during initialization, and signal
> > resource availability to the guest using the MSI-X vector number field.
> >
> >> Thus virtio can only estimate the guest usage as
> >> well
> >
> > At some level, this is fundamental: some guest operations
> > have no failure mode. So we must preallocate
> > some resources to make sure they won't fail.
>
> We can still track the expected maximum number of active vectors at core
> level, collect them from the KVM layer, and warn if we expect conflicts.
> Anxious MSI users could then refrain from using this feature; others
> might be fine with risking a slow-down on conflicts.

It seems like a nice feature until you have to debug it in the field :).
If you really think it's worthwhile, let's add a 'force' flag so that
advanced users at least can declare that they know what they are doing.

> >
> >> (a guest may or may not actually write non-null data into a
> >> vector and unmask it).
> >
> > Please, forget the non-NULL thing. The virtio driver knows exactly
> > how many vectors we use and communicates this info to the device.
> > This is not uncommon at all.
> >
> >>>
> >>>>>
> >>>>> I actually would not mind preallocating everything upfront, which
> >>>>> is much easier. But with your patch we get a silent failure or a
> >>>>> drastic slowdown, which is much more painful IMO.
> >>>>
> >>>> Again: did we already see that limit? And where does it come from,
> >>>> if not from KVM?
> >>>
> >>> It's a hardware limitation of Intel APICs: the interrupt vector is
> >>> encoded in an 8-bit field of the MSI message. So you can have at most
> >>> 256 of these.
> >>
> >> There should be no such limitation with the pseudo GSIs we use for MSI
> >> injection. They end up as MSI messages again, so actually 256 (minus
> >> reserved vectors) * number-of-cpus (on x86).
> >
> > This limits which CPUs can get the interrupt, though.
> > Linux seems to have a global pool, as it wants to be able to freely
> > balance vectors between CPUs. Or, consider a guest with a single CPU :)
> >
> > Anyway, why argue - there is a limitation, and it's not coming from KVM,
> > right?
>
> No, the limit we hit with MSI message routing is first of all KVM GSIs,
> and there only pseudo GSIs that do not go to any interrupt controller
> with limited pins.

I see KVM_MAX_IRQ_ROUTES is 1024. This is > 256, so KVM does not seem to
be the problem.

> That could easily be lifted in the kernel if we run
> into shortages in practice.

What I was saying is that resources are limited even without KVM.
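
For concreteness, each of those pseudo GSIs is just one entry in the
per-VM routing table that userspace hands to KVM via KVM_SET_GSI_ROUTING,
so KVM_MAX_IRQ_ROUTES is the only thing capping them. Roughly along the
lines of the sketch below (hand-written for illustration, not the
qemu-kvm code; note that the ioctl replaces the whole table, so real code
resubmits every route it knows about, and vm_fd is assumed to be an open
VM file descriptor):

    #include <linux/kvm.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>

    /* Route one MSI message through pseudo GSI 'gsi'.  For brevity this
     * installs a single-entry table; KVM_SET_GSI_ROUTING replaces the
     * whole per-VM table, so real code passes all routes at once. */
    static int set_msi_route(int vm_fd, uint32_t gsi,
                             uint64_t addr, uint32_t data)
    {
        struct kvm_irq_routing *table;
        int ret;

        table = calloc(1, sizeof(*table) + sizeof(table->entries[0]));
        if (!table)
            return -1;

        table->nr = 1;
        table->entries[0].gsi = gsi;
        table->entries[0].type = KVM_IRQ_ROUTING_MSI;
        table->entries[0].u.msi.address_lo = (uint32_t)addr;
        table->entries[0].u.msi.address_hi = (uint32_t)(addr >> 32);
        table->entries[0].u.msi.data = data;

        ret = ioctl(vm_fd, KVM_SET_GSI_ROUTING, table);
        free(table);
        return ret;
    }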
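
And to spell out the virtio setup sequence I described further up: the
driver selects a queue, writes the MSI-X vector it wants for it, and
reads the register back; the device answers VIRTIO_MSI_NO_VECTOR if it
could not allocate resources for that vector, which is what lets the
driver fall back to fewer vectors. A rough sketch of the guest side for
legacy virtio-pci (vp_read16/vp_write16 are made-up stand-ins for the
driver's real I/O accessors, not actual Linux functions):

    #include <stdbool.h>
    #include <stdint.h>

    #define VIRTIO_PCI_QUEUE_SEL     14      /* queue selector           */
    #define VIRTIO_MSI_QUEUE_VECTOR  22      /* per-queue MSI-X vector   */
    #define VIRTIO_MSI_NO_VECTOR     0xffff  /* device: "no vector"      */

    /* Stand-ins for the driver's real device register accessors. */
    uint16_t vp_read16(unsigned offset);
    void vp_write16(unsigned offset, uint16_t val);

    /* Tell the device which MSI-X vector queue 'qidx' should use and
     * check whether it accepted it: a NO_VECTOR readback means the
     * device is out of resources and the driver should retry the whole
     * setup with fewer vectors. */
    static bool vp_assign_queue_vector(uint16_t qidx, uint16_t vector)
    {
        vp_write16(VIRTIO_PCI_QUEUE_SEL, qidx);
        vp_write16(VIRTIO_MSI_QUEUE_VECTOR, vector);
        return vp_read16(VIRTIO_MSI_QUEUE_VECTOR) != VIRTIO_MSI_NO_VECTOR;
    }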
> >
> >>>
> >>>>>
> >>>>>> That's also why we do those data == 0
> >>>>>> checks to skip used but unconfigured vectors.
> >>>>>>
> >>>>>> Jan
> >>>>>
> >>>>> These checks work more or less by luck, BTW. It's
> >>>>> a hack which I hope lazy allocation will replace.
> >>>>
> >>>> The check is still valid (for x86) when we have to use static routes
> >>>> (device assignment, vhost).
> >>>
> >>> It's not valid at all - we are just lucky that Linux and
> >>> Windows guests seem to zero out the vector when it's not in use.
> >>> They do not have to do that.
> >>
> >> It is valid, as it is just an optimization. If an unused vector has a
> >> non-null data field, we just redundantly register a route where we do
> >> not actually have to.
> >
> > Well, the only reason we even have this code is that
> > it was claimed that some devices declare support for a huge number
> > of vectors which then go unused. So if the guest does not
> > do this, we'll run out of vectors ...
> >
> >> But we do need to be prepared
> >
> > And ATM we aren't, and probably can't be without kernel
> > changes, right?
> >
> >> for potentially
> >> arriving messages on that virtual GSI, either via irqfd or KVM device
> >> assignment.
> >>
> >> Jan
> >
> > Why irqfd? Device assignment is ATM the only place where we use these
> > ugly hacks.
>
> vfio will use irqfds. And virtio is partly out of the picture only
> because we know much more about virtio internals (specifically:
> "will not advertise more vectors than guests will want to use").
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
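
Just to spell out what those data == 0 checks buy us when routes have to
be registered statically: the idea is to walk the MSI-X table and only
burn a route on vectors the guest appears to have programmed. A rough
sketch of the heuristic (hand-written for illustration, not the actual
device assignment code; msix_entry_state is a made-up type):

    #include <stdint.h>

    /* Hypothetical snapshot of one MSI-X table entry as the guest wrote it. */
    struct msix_entry_state {
        uint32_t addr_lo;
        uint32_t addr_hi;
        uint32_t data;
        uint32_t ctrl;              /* bit 0 is the per-vector mask */
    };

    /* Count how many routes we would actually have to register up front.
     * The data == 0 heuristic assumes the guest never programmed a vector
     * whose data field is still zero.  Linux and Windows guests happen to
     * behave that way, but nothing requires it - hence "works by luck". */
    static unsigned routes_needed(const struct msix_entry_state *tbl,
                                  unsigned n)
    {
        unsigned needed = 0;
        unsigned i;

        for (i = 0; i < n; i++) {
            if (tbl[i].data == 0)
                continue;           /* looks unconfigured: skip the route */
            needed++;
        }
        return needed;
    }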