On 2012-03-28 18:30, Michael S. Tsirkin wrote:
> On Wed, Mar 28, 2012 at 06:00:03PM +0200, Jan Kiszka wrote:
>> On 2012-03-28 17:43, Michael S. Tsirkin wrote:
>>> On Wed, Mar 28, 2012 at 01:36:15PM +0200, Jan Kiszka wrote:
>>>> On 2012-03-28 13:31, Michael S. Tsirkin wrote:
>>>>>>>>> Also, how would this support irqfd in the future? Will we have to
>>>>>>>>> rip it all out and replace with per-device tracking that we
>>>>>>>>> have today?
>>>>>>>>
>>>>>>>> Irqfd and kvm device assignment will require additional interfaces (of
>>>>>>>> the kvm core in QEMU) via which you will be able to request stable
>>>>>>>> routes from such sources to specified MSIs. That will be widely
>>>>>>>> orthogonal to what is done in these patches here.
>>>>>>>
>>>>>>> Yes but not exactly as they will conflict for resources, right?
>>>>>>> How do you plan to solve this?
>>>>>>
>>>>>> As done in my original series: If a static route requires a pseudo GSI
>>>>>> and there are none free, we simply flush the dynamic MSI routes.
>>>>>
>>>>> Right. So static routes take precedence. This means that in effect
>>>>> we will have two APIs in qemu: for fast MSIs and for slow ones,
>>>>> the advantage of the slow APIs being that they are easier to use,
>>>>> right?
>>>>
>>>> We will have two APIs depending on the source of the MSI. Special
>>>> sources are the exception while emulated ones are the majority. And for
>>>> the latter we should try very hard to keep things simple and clean.
>>>>
>>>> Jan
>>>
>>> I assume this means yes :) So how about we replace the hash table with a
>>> single GSI reserved for this purpose, and use that for each interrupt?
>>> This will work fine for slow paths such as hotplug controller, yes it
>>> will be slow but *predictably* slow.
>>
>> AHCI, HDA, virtio-block, and every other userspace MSI user will suffer
>> - I can't imagine you really want this. :)
>
> These should use static GSI routes or the new API if it exists.

There will be an API to request an irqfd and associate it with an MSI
message, and the same for an assigned device IRQ/MSI vector. But none for
userspace-generated messages. That would mean hooking deep into the MSI
layer again - or even the devices themselves.

> Changing GSI routing when AHCI wants to send an interrupt
> will cause performance trouble in unpredictable ways:
> it triggers RCU write side and that can be *very* slow.

That's why we will have direct MSI injection for them. This here is just
to make it work without that feature in a reasonable, non-intrusive way.

If it really hurts that much, we need to invest more in avoiding cache
flushes. But I'm skeptical there is much to gain compared to the current
qemu-kvm model: every vector change that results in a route change
passes the RCU write side - and serializes other QEMU userspace exits.
That _is_ already a bottleneck. For example, every MSI IRQ rebalancing
between CPUs in the guest should trigger this.

What I would really like to avoid is introducing invasive abstractions
and hooks to QEMU that optimize for a scenario that is obsolete mid to
long term.

>
>>>
>>> Fast path will use static GSIs like qemu-kvm does.
>>
>> Nope, qemu-kvm hooks deeply into the MSI layer to track vectors. I don't
>> believe we want this upstream. It also doesn't work for non-PCI MSI
>> (HPET on x86, try -global hpet.timers=4 -global hpet.msi=on with Linux
>> guests).
>>
>> Jan
>
> Yes I understand you want an API on top of this, with
> some structure to abstract the ideas away from PCI.
> But the principle is that we'll track GSIs at the device
> and keep the mapping to vectors static.

Devices will track irqfd objects or similar abstractions for device
assignment, not GSIs explicitly. Under the hood, the GSI will be stored,
of course. That will be simpler to apply than the current open-coded
pattern.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
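
For readers following the thread, the scheme quoted above ("if a static
route requires a pseudo GSI and there are none free, we simply flush the
dynamic MSI routes") might look roughly like the sketch below. This is
not code from the patch series: RouteTable, route_dynamic_msi(),
route_static_msi() and the table size are hypothetical names chosen for
illustration, and the assumption that a full table also triggers a flush
on the dynamic path is mine. A plain array stands in for the hash table
mentioned in the discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table size; a real pseudo-GSI space would be larger. */
#define NUM_PSEUDO_GSIS 4

typedef enum { ROUTE_FREE = 0, ROUTE_DYNAMIC, ROUTE_STATIC } RouteKind;

typedef struct {
    RouteKind kind;
    uint64_t msi_addr;
    uint32_t msi_data;
} Route;

typedef struct {
    Route routes[NUM_PSEUDO_GSIS];
    unsigned flushes;            /* how often dynamic routes were dropped */
} RouteTable;

static void route_table_init(RouteTable *t)
{
    memset(t, 0, sizeof(*t));    /* ROUTE_FREE is 0, so all slots are free */
}

static int find_free(const RouteTable *t)
{
    for (int i = 0; i < NUM_PSEUDO_GSIS; i++) {
        if (t->routes[i].kind == ROUTE_FREE) {
            return i;
        }
    }
    return -1;
}

/* Drop every dynamic route; static routes are never touched. */
static void flush_dynamic(RouteTable *t)
{
    for (int i = 0; i < NUM_PSEUDO_GSIS; i++) {
        if (t->routes[i].kind == ROUTE_DYNAMIC) {
            t->routes[i].kind = ROUTE_FREE;
        }
    }
    t->flushes++;
}

/* Take a free pseudo GSI, flushing the dynamic routes once if none is left. */
static int install(RouteTable *t, RouteKind kind, uint64_t addr, uint32_t data)
{
    assert(kind != ROUTE_FREE);
    int gsi = find_free(t);
    if (gsi < 0) {
        flush_dynamic(t);
        gsi = find_free(t);
    }
    if (gsi < 0) {
        return -1;               /* everything is pinned by static routes */
    }
    t->routes[gsi] = (Route){ kind, addr, data };
    return gsi;
}

/* Userspace-generated MSI: reuse a cached route or create one on demand. */
static int route_dynamic_msi(RouteTable *t, uint64_t addr, uint32_t data)
{
    for (int i = 0; i < NUM_PSEUDO_GSIS; i++) {
        const Route *r = &t->routes[i];
        if (r->kind == ROUTE_DYNAMIC &&
            r->msi_addr == addr && r->msi_data == data) {
            return i;
        }
    }
    return install(t, ROUTE_DYNAMIC, addr, data);
}

/* Stable route (e.g. for irqfd or an assigned device); takes precedence. */
static int route_static_msi(RouteTable *t, uint64_t addr, uint32_t data)
{
    return install(t, ROUTE_STATIC, addr, data);
}
```

In this sketch a flushed dynamic route is simply re-created the next time
the same message is sent, which corresponds to the cost being debated
above: each re-creation would be a routing table update.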