Optimize the XICS emulation code in KVM as per the 'performance todos' in
the comments of book3s_xics.c.

Performance numbers:

1. Test case: pgbench run in a KVM on PowerVM guest for 120 seconds.

2. Time taken by arch_send_call_function_single_ipi() currently, measured
   with funclatency [1]:

   $ ./funclatency.py -u arch_send_call_function_single_ipi

        usecs               : count     distribution
            0 -> 1          : 7        |                                        |
            2 -> 3          : 16       |                                        |
            4 -> 7          : 141      |                                        |
            8 -> 15         : 4455631  |****************************************|
           16 -> 31         : 437981   |***                                     |
           32 -> 63         : 5036     |                                        |
           64 -> 127        : 92       |                                        |

   avg = 12 usecs, total: 60,532,481 usecs, count: 4,898,904

3. Time taken by arch_send_call_function_single_ipi() with the changes in
   this series:

   $ ./funclatency.py -u arch_send_call_function_single_ipi

        usecs               : count     distribution
            0 -> 1          : 15       |                                        |
            2 -> 3          : 7        |                                        |
            4 -> 7          : 3798     |                                        |
            8 -> 15         : 4569610  |****************************************|
           16 -> 31         : 339284   |**                                      |
           32 -> 63         : 4542     |                                        |
           64 -> 127        : 68       |                                        |
          128 -> 255        : 0        |                                        |
          256 -> 511        : 1        |                                        |

   avg = 11 usecs, total: 57,720,612 usecs, count: 4,917,325

4. This patch series has also been tested on KVM on a Power8 CPU.

[1]: https://github.com/iovisor/bcc/blob/master/tools/funclatency.py

Changes v1 -> v1 resend:
1. Add Cedric to CC

Gautam Menghani (3):
  arch/powerpc/kvm: Use bitmap to speed up resend of irqs in ICS
  arch/powerpc/kvm: Optimize the server number -> ICP lookup
  arch/powerpc/kvm: Reduce lock contention by moving spinlock from ics
    to irq_state

 arch/powerpc/kvm/book3s_hv_rm_xics.c |  8 ++--
 arch/powerpc/kvm/book3s_xics.c       | 70 ++++++++++++----------------
 arch/powerpc/kvm/book3s_xics.h       | 13 ++----
 3 files changed, 39 insertions(+), 52 deletions(-)

-- 
2.44.0