On Fri, Dec 09, 2016 at 04:08:00PM +0000, Andre Przywara wrote: > On 08/12/16 17:50, Andrew Jones wrote: > > +u32 gicv3_iar_irqnr(u32 iar) > > +{ > > + return iar; > > I am probably a bit paranoid here, but the spec says that the interrupt > ID is in bits[23:0] only (at most). Indeed, I'll add '& ((1 << 24) - 1' here. > > > +} > > + > > +void gicv3_ipi_send_mask(int irq, const cpumask_t *dest) > > +{ > > + u16 tlist; > > + int cpu; > > + > > + assert(irq < 16); > > + > > + /* > > + * For each cpu in the mask collect its peers, which are also in > > + * the mask, in order to form target lists. > > + */ > > + for_each_cpu(cpu, dest) { > > + u64 mpidr = cpus[cpu], sgi1r; > > + u64 cluster_id; > > + > > + /* > > + * GICv3 can send IPIs to up 16 peer cpus with a single > > + * write to ICC_SGI1R_EL1 (using the target list). Peers > > + * are cpus that have nearly identical MPIDRs, the only > > + * difference being Aff0. The matching upper affinity > > + * levels form the cluster ID. > > + */ > > + cluster_id = mpidr & ~0xffUL; > > + tlist = 0; > > + > > + /* > > + * Sort of open code for_each_cpu in order to have a > > + * nested for_each_cpu loop. > > + */ > > + while (cpu < nr_cpus) { > > + if ((mpidr & 0xff) >= 16) { > > + printf("cpu%d MPIDR:aff0 is %d (>= 16)!\n", > > + cpu, (int)(mpidr & 0xff)); > > + break; > > + } > > + > > + tlist |= 1 << (mpidr & 0xf); > > + > > + cpu = cpumask_next(cpu, dest); > > + if (cpu >= nr_cpus) > > + break; > > + > > + mpidr = cpus[cpu]; > > + > > + if (cluster_id != (mpidr & ~0xffUL)) { > > + /* > > + * The next cpu isn't in our cluster. Roll > > + * back the cpu index allowing the outer > > + * for_each_cpu to find it again with > > + * cpumask_next > > + */ > > + --cpu; > > + break; > > + } > > + } > > + > > + /* Send the IPIs for the target list of this cluster */ > > + sgi1r = (MPIDR_TO_SGI_AFFINITY(cluster_id, 3) | > > + MPIDR_TO_SGI_AFFINITY(cluster_id, 2) | > > + irq << 24 | > > + MPIDR_TO_SGI_AFFINITY(cluster_id, 1) | > > + tlist); > > + > > + gicv3_write_sgi1r(sgi1r); > > + } > > + > > + /* Force the above writes to ICC_SGI1R_EL1 to be executed */ > > + isb(); > > +} > > Wow, this is really heavy stuff, especially for a Friday afternoon ;-) > But I convinced myself that it's correct. The only issue is that it's > sub-optimal if the MPIDRs of the VCPUs are not in order, say: 0x000, > 0x100, 0x001. > In this case we do three register writes instead of the minimal two. > But it's still correct, so it's actually a minor nit just to prove that > I checked the algorithm ;-) > > So apart from the minor comment above: > > Reviewed-by: Andre Przywara <andre.przywara@xxxxxxx> Thanks! drew _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm