On Sat, Nov 12, 2022 at 11:09:45AM -0800, Yury Norov wrote:
> The function finds Nth set CPU in a given cpumask starting from a given
> node.
>
> Leveraging the fact that each hop in sched_domains_numa_masks includes the
> same or greater number of CPUs than the previous one, we can use binary
> search on hops instead of linear walk, which makes the overall complexity
> of O(log n) in terms of number of cpumask_weight() calls.

...

> +struct __cmp_key {
> +	const struct cpumask *cpus;
> +	struct cpumask ***masks;
> +	int node;
> +	int cpu;
> +	int w;
> +};
> +
> +static int cmp(const void *a, const void *b)

Calling them key and pivot (as in the caller) would make more sense.

> +{

What about

	const (?) struct cpumask ***masks = (...)pivot;

> +	struct cpumask **prev_hop = *((struct cpumask ***)b - 1);

	= masks[-1];

> +	struct cpumask **cur_hop = *(struct cpumask ***)b;

	= masks[0];

?

> +	struct __cmp_key *k = (struct __cmp_key *)a;

> +	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
> +		return 1;

> +	k->w = (b == k->masks) ? 0 : cpumask_weight_and(k->cpus, prev_hop[k->node]);
> +	if (k->w <= k->cpu)
> +		return 0;

Can k->cpu be negative? If not, we can rewrite the above as

	k->w = 0;
	if (b == k->masks)
		return 0;

	k->w = cpumask_weight_and(k->cpus, prev_hop[k->node]);

> +	return -1;
> +}

...

> +int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
> +{
> +	struct __cmp_key k = { cpus, NULL, node, cpu, 0 };

You can drop the NULL and 0 by using C99 designated initializers.

> +	int hop, ret = nr_cpu_ids;

> +	rcu_read_lock();

+ Blank line?

> +	k.masks = rcu_dereference(sched_domains_numa_masks);
> +	if (!k.masks)
> +		goto unlock;

> +	hop = (struct cpumask ***)
> +	      bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), cmp) - k.masks;

Strange indentation. I would rather see the split on the parameters and maybe
on the '-' operator.

sizeof(*k.masks) is a bit shorter, right?

Also we may go with

	struct cpumask ***masks;
	struct __cmp_key k = { .cpus = cpus, .node = node, .cpu = cpu };

> +	ret = hop ?
> +		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
> +		cpumask_nth_and(cpu - k.w, cpus, k.masks[0][node]);

> +unlock:

out_unlock: shows the intention more clearly, no?

> +	rcu_read_unlock();
> +	return ret;
> +}

-- 
With Best Regards,
Andy Shevchenko
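
P.S. Purely as an illustration, a rough and completely untested sketch of how the
two functions might look with all of the above folded in (the parameter names
key/pivot and the out_unlock label are only my suggestions, not something the
patch has to use):

	struct __cmp_key {
		const struct cpumask *cpus;
		struct cpumask ***masks;
		int node;
		int cpu;
		int w;
	};

	static int cmp(const void *key, const void *pivot)
	{
		/* pivot points at an element of the k->masks[] hop array */
		struct cpumask ***masks = (struct cpumask ***)pivot;
		struct __cmp_key *k = (struct __cmp_key *)key;

		/* Nth CPU is beyond this hop, search the later hops */
		if (cpumask_weight_and(k->cpus, masks[0][k->node]) <= k->cpu)
			return 1;

		/* First hop counts from 0, nothing to subtract */
		k->w = 0;
		if (masks == k->masks)
			return 0;

		/* Nth CPU is in this hop if the previous hop was too small */
		k->w = cpumask_weight_and(k->cpus, masks[-1][k->node]);
		if (k->w <= k->cpu)
			return 0;

		return -1;
	}

	int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
	{
		struct __cmp_key k = { .cpus = cpus, .node = node, .cpu = cpu };
		int hop, ret = nr_cpu_ids;

		rcu_read_lock();

		k.masks = rcu_dereference(sched_domains_numa_masks);
		if (!k.masks)
			goto out_unlock;

		hop = (struct cpumask ***)bsearch(&k, k.masks, sched_domains_numa_levels,
						  sizeof(*k.masks), cmp) - k.masks;

		ret = hop ? cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node],
						   k.masks[hop - 1][node]) :
			    cpumask_nth_and(cpu - k.w, cpus, k.masks[0][node]);

	out_unlock:
		rcu_read_unlock();
		return ret;
	}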