From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>
> When sending an IPI to a single CPU there is no need to deal with cpumasks.
> With a 2 CPU guest on WS2019 I'm seeing a minor (about 3%, 8043 -> 7761 CPU
> cycles) improvement with the smp_call_function_single() loop benchmark. The
> optimization, however, is tiny and straightforward. Also, send_ipi_one() is
> important for the PV spinlock kick.
>
> I was also wondering if it would make sense to switch to using the regular
> APIC IPI send for the CPU > 64 case, but no, it is twice as expensive (12650 CPU
> cycles for the __send_ipi_mask_ex() call, 26000 for orig_apic.send_IPI(cpu,
> vector)).
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> ---
> Changes since v1:
> - Style changes [Roman, Joe]
> ---
>  arch/x86/hyperv/hv_apic.c           | 13 ++++++++++---
>  arch/x86/include/asm/trace/hyperv.h | 15 +++++++++++++++
>  2 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> index e01078e93dd3..fd17c6341737 100644
> --- a/arch/x86/hyperv/hv_apic.c
> +++ b/arch/x86/hyperv/hv_apic.c
> @@ -194,10 +194,17 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector)
>
>  static bool __send_ipi_one(int cpu, int vector)
>  {
> -	struct cpumask mask = CPU_MASK_NONE;
> +	trace_hyperv_send_ipi_one(cpu, vector);
>
> -	cpumask_set_cpu(cpu, &mask);
> -	return __send_ipi_mask(&mask, vector);
> +	if (!hv_hypercall_pg || (vector < HV_IPI_LOW_VECTOR) ||
> +	    (vector > HV_IPI_HIGH_VECTOR))
> +		return false;
> +
> +	if (cpu >= 64)
> +		return __send_ipi_mask_ex(cpumask_of(cpu), vector);

The above test should be checking the VP number, not the CPU number, since
the VP number is used to form the bitmap argument to the hypercall. In all
current implementations of Hyper-V, the CPU number and VP number are the
same as far as I am aware, but that's not guaranteed in the future.

Michael

> +
> +	return !hv_do_fast_hypercall16(HVCALL_SEND_IPI, vector,
> +				BIT_ULL(hv_cpu_number_to_vp_number(cpu)));
>  }
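
For illustration only (this is neither part of Vitaly's posted patch nor code
from Michael's reply), a minimal sketch of how the check could be reworked
along the lines Michael suggests: compute the VP number once via
hv_cpu_number_to_vp_number() and test that, rather than the CPU number,
against 64 before falling back to __send_ipi_mask_ex():

	/* Sketch only: decide fast-vs-ex path based on the VP number,
	 * since the VP number is what forms the bitmap argument to the
	 * HVCALL_SEND_IPI fast hypercall.
	 */
	static bool __send_ipi_one(int cpu, int vector)
	{
		int vp = hv_cpu_number_to_vp_number(cpu);

		trace_hyperv_send_ipi_one(cpu, vector);

		if (!hv_hypercall_pg || (vector < HV_IPI_LOW_VECTOR) ||
		    (vector > HV_IPI_HIGH_VECTOR))
			return false;

		if (vp >= 64)
			return __send_ipi_mask_ex(cpumask_of(cpu), vector);

		return !hv_do_fast_hypercall16(HVCALL_SEND_IPI, vector,
					       BIT_ULL(vp));
	}

This keeps the fast single-target path for VPs 0-63 and only falls back to
the slower _ex variant when the VP number cannot be represented in the
64-bit mask, which is the behavior the posted patch intends.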