On 25/09/22 07:58, Yury Norov wrote: > On Fri, Sep 23, 2022 at 04:55:41PM +0100, Valentin Schneider wrote: >> +/** >> + * for_each_numa_hop_cpu - iterate over CPUs by increasing NUMA distance, >> + * starting from a given node. >> + * @cpu: the iteration variable. >> + * @node: the NUMA node to start the search from. >> + * >> + * Requires rcu_lock to be held. >> + * Careful: this is a double loop, 'break' won't work as expected. > > This warning concerns me not only because new iteration loop hides > complexity and breaks 'break' (sic!), but also because it looks too > specific. Why don't you split it, so instead: > > for_each_numa_hop_cpu(cpu, dev->priv.numa_node) { > cpus[i] = cpu; > if (++i == ncomp_eqs) > goto spread_done; > } > > in the following patch you would have something like this: > > for_each_node_hop(hop, node) { > struct cpumask hop_cpus = sched_numa_hop_mask(node, hop); > > for_each_cpu_andnot(cpu, hop_cpus, ...) { > cpus[i] = cpu; > if (++i == ncomp_eqs) > goto spread_done; > } > } > > It looks more bulky, but I believe there will be more users for > for_each_node_hop() alone. > > On top of that, if you really like it, you can implement > for_each_numa_hop_cpu() if you want. > IIUC you're suggesting to introduce an iterator for the cpumasks first, and then maybe add one on top for the individual cpus. I'm happy to do that, though I have to say I'm keen to keep the CPU iterator - IMO the complexity is justified if it is centralized in one location and saves us from boring old boilerplate code. >> + * Implementation notes: >> + * >> + * Providing it is valid, the mask returned by >> + * sched_numa_hop_mask(node, hops+1) >> + * is a superset of the one returned by >> + * sched_numa_hop_mask(node, hops) >> + * which may not be that useful for drivers that try to spread things out and >> + * want to visit a CPU not more than once. >> + * >> + * To accommodate for that, we use for_each_cpu_andnot() to iterate over the cpus >> + * of sched_numa_hop_mask(node, hops+1) with the CPUs of >> + * sched_numa_hop_mask(node, hops) removed, IOW we only iterate over CPUs >> + * a given distance away (rather than *up to* a given distance). >> + * >> + * hops=0 forces us to play silly games: we pass cpu_none_mask to >> + * for_each_cpu_andnot(), which turns it into for_each_cpu(). >> + */ >> +#define for_each_numa_hop_cpu(cpu, node) \ >> + for (struct { const struct cpumask *curr, *prev; int hops; } __v = \ >> + { sched_numa_hop_mask(node, 0), NULL, 0 }; \ > > This anonymous structure is never used as structure. What for you > define it? Why not just declare hops, prev and curr without packing > them? > I haven't found a way to do this that doesn't involve a struct - apparently you can't mix types in a for loop declaration clause. > Thanks, > Yury >