On Tue, Oct 18, 2011 at 02:38:36PM +0200, Jan Kiszka wrote: > On 2011-10-18 14:33, Michael S. Tsirkin wrote: > > On Tue, Oct 18, 2011 at 02:08:59PM +0200, Jan Kiszka wrote: > >> On 2011-10-18 13:58, Michael S. Tsirkin wrote: > >>> On Mon, Oct 17, 2011 at 09:28:12PM +0200, Jan Kiszka wrote: > >>>> On 2011-10-17 17:48, Michael S. Tsirkin wrote: > >>>>> On Mon, Oct 17, 2011 at 11:28:02AM +0200, Jan Kiszka wrote: > >>>>>> This optimization was only required to keep KVM route usage low. Now > >>>>>> that we solve that problem via lazy updates, we can drop the field. We > >>>>>> still need interfaces to clear pending vectors, though (and we have to > >>>>>> make use of them more broadly - but that's unrelated to this patch). > >>>>>> > >>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx> > >>>>> > >>>>> Lazy updates should be an implementation detail. > >>>>> IMO resource tracking of vectors makes sense > >>>>> as an API. Making devices deal with pending > >>>>> vectors as a concept, IMO, does not. > >>>> > >>>> There is really no use for tracking the vector lifecycle once we have > >>>> lazy updates (except for static routes). It's a way too invasive > >>>> concept, and it's not needed for anything but KVM. > >>> > >>> I think it's needed. The PCI spec states that when the device > >>> does not need an interrupt anymore, it should clear the pending > >>> bit. > >> > >> That should be done explicitly if it is required outside existing > >> clearing points. We already have that service, it's called > >> msix_clear_vector. > > > > We do? I don't seem to see it upstream... > > True. From the device's POV, MSI-X (and also MSI!) vectors are actually > level-triggered. This definitely takes adjusting to. > So we should communicate the level to the MSI core and > not just the edge. Needs more fixing > > > > >> That alone does not justify msix_vector_use and all > >> the state and logic behind it IMHO. > > > > To me it looks like an abstraction that solves both > > this problem and the resource allocation problem. > > Resources are actually limited BTW, this is not just > > a KVM thing. qemu.git currently lets guests decide > > what to do with them, but it might turn out to > > be benefitial to warn the management application > > that it is shooting itself in the foot. > > > >>> The use/unuse is IMO a decent API for this, > >>> because it uses a familiar resource tracking concept. > >>> Exposing this knowledge of msix to devices seems > >>> like a worse API. > >>> > >>>> > >>>> If you want an example, check > >>>> http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/70915 and > >>>> compare it to the changes done to hpet in this series. > >>>> > >>>> Jan > >>>> > >>> > >>> This seems to be a general argument that lazy updates are good? > >>> I have no real problem with them, besides the fact that > >>> we need an API to reserve space in the routing > >>> table so that device setup can fail upfront. > >> > >> That's not possible, even with used vectors, as devices change their > >> vector usage depending on how the guest configures the devices. If you > >> (pre-)allocate all possible vectors, you may run out of resources > >> earlier than needed actually. > > > > This really depends, but please do look at how with virtio > > we report resource shortage to guest and let it fall back to > > level interrups. You seem to remove that capability. > > To my understanding, virtio will be the exception as no other device > will have a chance to react on resource shortage while sending(!) an MSI > message. Hmm, are you familiar with that spec? This is not what virtio does, resource shortage is detected during setup. This is exactly the problem with lazy registration as you don't allocate until it's too late. > > > > I actually would not mind preallocating everything upfront which is much > > easier. But with your patch we get a silent failure or a drastic > > slowdown which is much more painful IMO. > > Again: did we already saw that limit? And where does it come from if not > from KVM? It's a hardware limitation of intel APICs. interrupt vector is encoded in an 8 bit field in msi address. So you can have at most 256 of these. > > > >> That's also why we do those data == 0 > >> checks to skip used but unconfigured vectors. > >> > >> Jan > > > > These checks work more or less by luck BTW. It's > > a hack which I hope lazy allocation will replace. > > The check is still valid (for x86) when we have to use static routes > (device assignment, vhost). It's not valid at all - we are just lucky that linux and windows guests seem to zero out the vector when it's not in use. They do not have to do that. > For lazy updates, it's obsolete, that's true. > > Jan > > -- > Siemens AG, Corporate Technology, CT T DE IT 1 > Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html