On 2012-10-26 12:44, Benjamin Herrenschmidt wrote:
> On Fri, 2012-10-26 at 12:15 +0200, Paolo Bonzini wrote:
>>> Whether you want to do startup configuration and board wiring via
>>> the same ioctl that handles runtime state save/load/migration is
>>> a different question, of course.
>>
>> QEMU's MSI-X routing is not x86-specific, so it should use the same
>> KVM_SET_GSI_ROUTING ioctl that x86 uses.
>
> Well, that's the thing, I haven't managed to figure that out so far, it
> looks very x86-specific to me. To begin with there's no such thing as a
> "GSI" in our world.
>
> Basically we have a global interrupt number space. Interrupt numbers are
> 24-bit long quantities. On real HW, some bits are called the "BUID" and
> identify a given source controller and some bits are the interrupt
> within that source controller but that's fairly flexible and generally
> the OS doesn't care about it. The firmware sets up the mappings and
> tells us the final numbers via the device-tree.
>
> Under a hypervisor, it's totally virtualized already so we show a flat
> 24 bit number space to the guest.
>
> MSIs don't work exactly like x86 either. On real HW, we have a different
> MSI port per "partitionable endpoint" which are used purely for
> validation of access permission. The message itself contains the
> interrupt source number within the BUID of the bridge. A given bridge
> today can contain up to 256 of these on a P7IOC chip but upcoming stuff
> can have thousands. The final interrupt number seen by the OS is thus
> just that MSI message in the bottom bits and the BUID in the top bits.
>
> Here too, under a hypervisor, it's all virtualized so qemu just gives 24
> bit numbers to the various emulated MSIs as part of the global interrupt
> number space.
>
> I'm not sure how any of that would need kernel communication. All we
> need is to be able to associate a given global interrupt with an
> eventfd.

At the latest at that point, you will need the IRQ routing infrastructure
of KVM.
It tells KVM which "virtual IRQ" (badly named "GSI") triggers which event
at which input, e.g. a physical IRQ line at some IRQ controller or a
specific message at some MSI receiver. You shouldn't try to invent a Power
wheel here, but rather tune the existing one to become more generic. We
could even try to get rid of that unfortunate GSI name (while leaving
aliases behind), though that is cosmetic.

>
> I might just miss some subtleties here but so far I haven't been able to
> figure out how to "shoehorn" our stuff in the very x86-centric existing
> interfaces to the kernel APICs. In fact all that code is in a generic
> location in kvm but is really x86/ia64 centric and the interfaces seem
> to be as well.

That's not true in general, though you will surely find a lot of x86
traces, and still a few concrete x86 bits, under virt/kvm.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
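
Illustration of the routing idea discussed in this thread: a toy model in Python, not KVM's actual interface. It shows a table that maps a virtual IRQ number to either an input pin on some interrupt controller or a message at some MSI receiver, plus a helper that builds a flat 24-bit number with the BUID in the top bits and the source number in the bottom bits. All names here (`IrqchipPin`, `MsiMessage`, `RoutingTable`, `global_irq`) and the `source_bits` parameter are hypothetical; the thread only says the BUID/source split is flexible.

```python
from dataclasses import dataclass
from typing import Dict, Union

IRQ_BITS = 24  # the thread describes a flat 24-bit interrupt number space


@dataclass(frozen=True)
class IrqchipPin:
    """Hypothetical target: input `pin` of interrupt controller `chip`."""
    chip: int
    pin: int


@dataclass(frozen=True)
class MsiMessage:
    """Hypothetical target: a specific message at an MSI receiver."""
    address: int
    data: int


Target = Union[IrqchipPin, MsiMessage]


def global_irq(buid: int, source: int, source_bits: int) -> int:
    """Put the BUID in the top bits and the source number in the bottom bits.

    `source_bits` is an assumption made for this sketch only.
    """
    if buid < 0 or not 0 <= source < (1 << source_bits):
        raise ValueError("buid or source out of range")
    irq = (buid << source_bits) | source
    if irq >= (1 << IRQ_BITS):
        raise ValueError("result exceeds the 24-bit number space")
    return irq


class RoutingTable:
    """Toy routing table: virtual IRQ number -> target."""

    def __init__(self) -> None:
        self._routes: Dict[int, Target] = {}

    def set_route(self, virq: int, target: Target) -> None:
        if not 0 <= virq < (1 << IRQ_BITS):
            raise ValueError("virtual IRQ outside the 24-bit space")
        self._routes[virq] = target

    def resolve(self, virq: int) -> Target:
        # Raises KeyError when no route was configured for this number.
        return self._routes[virq]


# Example: route one emulated MSI and one wired interrupt line.
table = RoutingTable()
msi_virq = global_irq(buid=0x12, source=5, source_bits=8)
table.set_route(msi_virq, MsiMessage(address=0xFEE00000, data=5))
table.set_route(16, IrqchipPin(chip=0, pin=16))
```

The point of the model is that the number on the left-hand side of the table carries no x86 meaning by itself; only the target types on the right-hand side are specific to a given interrupt architecture.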