On Fri, Aug 19, 2016 at 03:35:44PM +1000, Paul Mackerras wrote:
> This patch set reduces the latency for presenting interrupts from PCI
> pass-through devices to a Book3S HV guest. Currently, if an interrupt
> arrives from a PCI pass-through device while a guest is running, it
> causes an exit of all threads on the core to the host, where the
> interrupt is handled by making an interrupt pending in the virtual
> XICS interrupt controller for the guest that owns the device.
> Furthermore, there is currently no attempt to direct PCI pass-through
> device interrupts to the physical core where their target VCPU is
> running, so they often land on a different core and require an IPI
> to interrupt the VCPU.
>
> With this patch set, if the interrupt arrives on a core where the
> correct guest is running, it can be handled in hypervisor real mode
> without needing an exit to host context. If the destination VCPU is
> on the same core, then we can interrupt it using at most a msgsnd
> (message send) instruction, which is considerably faster than an IPI.
>
> Further, if an interrupt arrives on a different core, we then change
> the destination for the interrupt in the physical interrupt controller
> to point to the core where the VCPU is running. For now, we always
> direct the interrupt to thread 0 of the core because the other threads
> are offline from the point of view of the host, and the offline loop
> (which is where those other threads run when thread 0 is in host
> context) doesn't handle device interrupts.
>
> This patch set is based on a patch set from Suresh Warrier, with
> considerable revision by me. The data structure for mapping host
> interrupt numbers to guest interrupt numbers is just a flat array that
> is searched linearly, which works and is simple but could perform
> poorly with large numbers of interrupt sources. It would be simple to
> replace this mapping array with a more sophisticated data structure in
> future.
>
> To test the performance of this patch set, I used a network one-byte
> ping-pong test between a guest with a Mellanox CX-3 passed through to
> it, connected over 10Gb Ethernet to another POWER8 system running
> bare-metal with a Chelsio 10Gb Ethernet adapter. (The guest was
> running Ubuntu 16.04.1 under QEMU v2.7-rc2 on a POWER8.) Without this
> patch set, the round-trip latency was 43us, and with it the latency
> was 41us, a saving of 2us per round-trip.

Series applied to my kvm-ppc-next branch.

Paul.
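
For readers following along, here is a minimal C sketch of the fast-path
decision the cover letter describes: handle the interrupt in real mode if
possible, msgsnd if the VCPU is on this core, otherwise retarget the
physical interrupt. Every name below (handle_passthru_irq, map_host_irq,
and the extern helpers) is invented for the sketch and is not an actual
kernel symbol; this only illustrates the decision flow, not the real
implementation.

#include <stdbool.h>
#include <stdint.h>

enum irq_disposition {
        EXIT_TO_HOST,           /* fall back to the existing slow path */
        HANDLED_IN_REAL_MODE,   /* no exit to host context needed */
};

struct guest_irq;               /* opaque for this sketch */

/* Stubs standing in for hypothetical hypervisor primitives. */
extern struct guest_irq *map_host_irq(uint32_t host_irq);
extern void set_pending_in_virtual_xics(struct guest_irq *girq);
extern bool vcpu_on_this_core(struct guest_irq *girq);
extern void msgsnd_to_vcpu_thread(struct guest_irq *girq);
extern int core_of_vcpu(struct guest_irq *girq);
extern void retarget_irq_to_core(uint32_t host_irq, int core, int thread);

static enum irq_disposition handle_passthru_irq(uint32_t host_irq)
{
        struct guest_irq *girq = map_host_irq(host_irq);

        if (!girq)
                return EXIT_TO_HOST;    /* not a known pass-through source */

        /* Make the interrupt pending in the guest's virtual XICS. */
        set_pending_in_virtual_xics(girq);

        if (vcpu_on_this_core(girq)) {
                /* Same core: at most a msgsnd, cheaper than an IPI. */
                msgsnd_to_vcpu_thread(girq);
        } else {
                /* Wrong core: retarget the physical interrupt to thread 0
                 * of the VCPU's core, since the other threads are offline
                 * from the host's point of view. */
                retarget_irq_to_core(host_irq, core_of_vcpu(girq), 0);
        }

        return HANDLED_IN_REAL_MODE;
}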
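
The flat mapping array mentioned in the cover letter can likewise be
sketched in a few lines. Again, the names, entry layout, and size here
are assumptions made for illustration, not the data structure actually
merged; only the shape (flat array, linear search) comes from the letter.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pirq_map_entry {
        uint32_t host_irq;      /* global interrupt number on the host */
        uint32_t guest_gsi;     /* source number in the guest's virtual XICS */
        bool     in_use;
};

#define PIRQ_MAP_SIZE 128

static struct pirq_map_entry pirq_map[PIRQ_MAP_SIZE];

/* Linear search: simple and fine for a handful of pass-through sources,
 * but O(n), which is why the letter notes it could perform poorly with
 * large numbers of interrupt sources. */
static struct pirq_map_entry *pirq_map_lookup(uint32_t host_irq)
{
        for (size_t i = 0; i < PIRQ_MAP_SIZE; i++)
                if (pirq_map[i].in_use && pirq_map[i].host_irq == host_irq)
                        return &pirq_map[i];
        return NULL;
}

A "more sophisticated data structure in future", as the letter puts it,
would presumably key on host_irq for O(1) or O(log n) lookup, e.g. a hash
table or radix tree, without changing the callers of the lookup.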