On 09/11/19 08:05, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>
> This patch tries to optimize x2apic physical destination mode, fixed delivery
> mode single target IPI by delivering IPI to receiver immediately after sender
> writes ICR vmexit to avoid various checks when possible.
>
> Testing on Xeon Skylake server:
>
> The virtual IPI latency from sender send to receiver receive reduces more than
> 330+ cpu cycles.
>
> Running hackbench(reschedule ipi) in the guest, the avg handle time of MSR_WRITE
> caused vmexit reduces more than 1000+ cpu cycles:
>
> Before patch:
>
>   VM-EXIT    Samples  Samples%  Time%   Min Time  Max Time  Avg time
>   MSR_WRITE  5417390  90.01%    16.31%  0.69us    159.60us  1.08us
>
> After patch:
>
>   VM-EXIT    Samples  Samples%  Time%   Min Time  Max Time  Avg time
>   MSR_WRITE  6726109  90.73%    62.18%  0.48us    191.27us  0.58us

Do you have retpolines enabled?  The bulk of the speedup might come just
from the indirect jump.

Paolo