On Fri, 20 Jul 2018 at 00:47, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 19/07/2018 18:28, Radim Krčmář wrote:
> >> +
> >> +	kvm_hypercall3(KVM_HC_SEND_IPI, ipi_bitmap_low, ipi_bitmap_high, vector);
> > and
> >
> >	kvm_hypercall3(KVM_HC_SEND_IPI, ipi_bitmap[0], ipi_bitmap[1], vector);
> >
> > Still, the main problem is that we can only address 128 APICs.
> >
> > A simple improvement would reuse the vector field (as we need only 8
> > bits) and put a 'offset' in the rest. The offset would say which
> > cluster of 128 are we addressing. 24 bits of offset results in 2^31
> > total addressable CPUs (we probably should even use that many bits).
> > The downside of this is that we can only address 128 at a time.
> >
> > It's basically the same as x2apic cluster mode, only with 128 cluster
> > size instead of 16, so the code should be a straightforward port.
> > And because x2apic code doesn't seem to use any division by the cluster
> > size, we could even try to use kvm_hypercall4, add ipi_bitmap[2], and
> > make the cluster size 192. :)
>
> I did suggest an offset earlier in the discussion.
>
> The main problem is that consecutive CPU ids do not map to consecutive
> APIC ids. But still, we could do an hypercall whenever the total range
> exceeds 64. Something like
>
>	u64 ipi_bitmap = 0;
>	for_each_cpu(cpu, mask)
>		if (!ipi_bitmap) {
>			min = max = cpu;
>		} else if (cpu < min && max - cpu < 64) {
>			ipi_bitmap <<= min - cpu;
>			min = cpu;
>		} else if (id < min + 64) {
>			max = cpu < max ? max : cpu;
>		} else {
>			/* ... send hypercall... */
>			min = max = cpu;
>			ipi_bitmap = 0;
>		}
>		__set_bit(ipi_bitmap, cpu - min);
>	}
>	if (ipi_bitmap) {
>		/* ... send hypercall... */
>	}
>
> We could keep the cluster size of 128, but it would be more complicated
> to do the left shift in the first "else if". If the limit is 64, you
> can keep the two arguments in the hypercall, and just pass 0 as the
> "high" bitmap on 64-bit kernels.

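For reference, the 64-bit sliding window quoted above can be sketched in
plain userspace C. This is only an illustration under assumptions, not the
patch itself: send_ipi_window(), struct ipi_call and the calls[] log are
hypothetical stand-ins for the hypercall, and the input is an array of
already-translated ids rather than a cpumask. Compared with the quoted
pseudocode, the sketch tests the same variable in every branch and adds an
"id >= min" guard to the third branch, so that an id below the window that
does not fit is flushed instead of producing a negative bit index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW_BITS 64
#define MAX_CALLS   16

/* Hypothetical record of one "hypercall": bit i set => target id base + i. */
struct ipi_call {
	uint64_t bitmap;
	uint32_t base;
};

struct ipi_call calls[MAX_CALLS];
size_t n_calls;

/* Hypothetical stand-in for the real hypercall; it only logs its arguments. */
void send_ipi_window(uint64_t bitmap, uint32_t base)
{
	assert(n_calls < MAX_CALLS);
	calls[n_calls].bitmap = bitmap;
	calls[n_calls].base = base;
	n_calls++;
}

/* Batch ids into windows of at most 64 consecutive ids, one call per window. */
void send_ipi_ids(const uint32_t *ids, size_t n)
{
	uint64_t bitmap = 0;
	uint32_t min = 0, max = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t id = ids[i];

		if (!bitmap) {
			/* Empty window: start a new one at this id. */
			min = max = id;
		} else if (id < min && max - id < WINDOW_BITS) {
			/* Extend the window downwards: rebase the bitmap on id. */
			bitmap <<= min - id;
			min = id;
		} else if (id >= min && id - min < WINDOW_BITS) {
			/* Fits in the current window: only track the new maximum. */
			if (id > max)
				max = id;
		} else {
			/* Does not fit: flush and start a new window at this id. */
			send_ipi_window(bitmap, min);
			bitmap = 0;
			min = max = id;
		}
		bitmap |= UINT64_C(1) << (id - min);
	}

	/* Flush whatever is left in the last window. */
	if (bitmap)
		send_ipi_window(bitmap, min);
}
```

With the ids 3, 1, 64, 130 this issues two calls: the first covers ids 1, 3
and 64 relative to base 1, the second covers id 130 alone.
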
As David pointed out, we need to scale to higher APIC IDs. I will add the
CPU id to APIC id translation in the for loop. How about calling

	kvm_hypercall2(KVM_HC_SEND_IPI, ipi_bitmap, vector);

directly? In addition, why do we need to pass 0 as the "high" bitmap, even
for the 128 vCPUs case?

Regards,
Wanpeng Li