On Mon, Sep 23, 2013 at 09:34:01PM +0300, Gleb Natapov wrote:
> On Mon, Sep 23, 2013 at 09:24:14PM +1000, Paul Mackerras wrote:
> > On Sun, Sep 22, 2013 at 03:32:53PM +0300, Gleb Natapov wrote:
> > > On Tue, Sep 17, 2013 at 07:18:40PM +1000, Paul Mackerras wrote:
> > > > This implements a simple way to express the case of IRQ routing where
> > > > there is one in-kernel PIC and the system interrupts (GSIs) are routed
> > > > 1-1 to the corresponding pins of the PIC. This is expressed by having
> > > > kvm->irq_routing == NULL with a skeleton irq routing entry in the new
> > > > kvm->default_irq_route field.
> > > >
> > > > This provides a convenient way to provide backwards compatibility when
> > > > adding IRQ routing capability to an existing in-kernel PIC, such as the
> > > > XICS emulation on powerpc.
> > > >
> > > Why not create a simple 1-1 irq routing table? It will take a little bit
> > > more memory, but there will be no need for kvm->irq_routing == NULL
> > > special handling.
> >
> > The short answer is that userspace wants to use interrupt source
> > numbers (i.e. pin numbers for the inputs to the emulated XICS) that
> > are scattered throughout a large space, since that mirrors what real
> > hardware does. More specifically, hardware divides up the interrupt
> > source number into two fields, each of typically 12 bits, where the
> > more significant field identifies an "interrupt source unit" (ISU) and
> > the less significant field identifies an interrupt within the ISU.
> > Each PCI host bridge would have an ISU, for example, and there can be
> > ISUs associated with other things that attach directly to the
> > interconnect fabric (coprocessors, cluster interconnects, etc.).
> >
> > Today, QEMU creates a virtual ISU numbered 1 for the emulated PCI host
> > bridge, which means for example that virtio devices get interrupt pin
> > numbers starting at 4096.
> >
> > So, I could have increased KVM_IRQCHIP_NUM_PINS to some multiple of
> > 4096, say 16384, which would allow for 3 ISUs. But that would bloat
> > out struct kvm_irq_routing_table to over 64kB, and if I wanted 1-1
> > mappings between GSIs and pins for all of them, the routing table would
> > be over 960kB.
> >
> Yes, this is not an option. GSI is just a cookie for anything but x86
> non-MSI interrupts. So the way to use the irq routing table to deliver
> XICS irqs is to register a GSI->XICS irq mapping; then, by triggering
> the "GSI", which is just an arbitrary number, userspace tells the kernel
> that the XICS irq registered for that GSI should be injected.

Yes, that's fine as far as it goes, but the trouble is that the existing
data structures (specifically the chip[][] array in struct
kvm_irq_routing_table) don't handle well the case where the pin numbers
are large and/or sparse. In other words, using a small compact set of
GSI numbers wouldn't help, because it's not the GSI -> pin mapping that
is the problem; it is the reverse pin -> GSI mapping.

> > There is a compatibility concern too -- if I want existing userspace
> > to run, I would have to create 1-1 default mappings for at least the
> > first (say) 4200 pins or so, which would use up about 294kB. That
> > really doesn't seem worth it compared to just using the null routing
> > table pointer to indicate an unlimited 1-1 mapping.
> >
> Let me check that I understand you correctly. Existing userspace already
> uses XICS irqs directly, and now you want to switch this interface to use
> irq routing. New userspace will register a GSI->XICS irq mapping as
> described above; this is what [2/2] does. Is this correct?

I don't particularly want irq routing, but it appears to be unavoidable
if one wants IRQFD, which I do want. There is no particular advantage
to userspace in using a GSI -> XICS irq mapping; if userspace did set
one up, it would be the identity mapping.

Paul.