On 26/04/23 13:51, Yury Norov wrote:
>> I realized I only wrote half the relevant code - comparing node IDs is
>> meaningless, I meant to compare distances as we walk through the
>> CPUs... I tested the below against a few NUMA topologies and it seems to be
>> sane:
>>
>> @@ -756,12 +773,23 @@ static void __init test_for_each_numa(void)
>>  {
>>         unsigned int cpu, node;
>>
>> -       for (node = 0; node < sched_domains_numa_levels; node++) {
>> -               unsigned int hop, c = 0;
>> +       for_each_node(node) {
>> +               unsigned int start_cpu, prev_dist, hop = 0;
>> +
>> +               cpu = cpumask_first(cpumask_of_node(node));
>> +               prev_dist = node_distance(node, node);
>> +               start_cpu = cpu;
>>
>>                 rcu_read_lock();
>> -               for_each_numa_cpu(cpu, hop, node, cpu_online_mask)
>> -                       expect_eq_uint(cpumask_local_spread(c++, node), cpu);
>> +
>> +               /* Assert distance is monotonically increasing */
>> +               for_each_numa_cpu(cpu, hop, node, cpu_online_mask) {
>> +                       unsigned int dist = node_distance(cpu_to_node(cpu), cpu_to_node(start_cpu));
>
> Interestingly, node_distance() is an arch-specific function. Generic
> implementation is quite useless:
>
> #define node_distance(from,to) ((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)
>
> Particularly, arm64 takes the above. With node_distance() implemented
> like that, we can barely test something...
>

riscv and arm64 rely on drivers/base/arch_numa.c to provide
__node_distance() (cf. CONFIG_GENERIC_ARCH_NUMA).
x86, sparc, powerpc and ia64 define __node_distance().
loongarch and mips define their own node_distance().

So all of those archs will have a usable node_distance(); the others won't,
and that means the scheduler can't do anything about it - the scheduler
relies on node_distance() to understand the topology!
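
To illustrate why the generic fallback makes a monotonicity check nearly
vacuous, here is a minimal userspace sketch. It is not kernel code: the
4-node distance table, generic_node_distance(), table_node_distance() and
walk_is_monotonic() are all hypothetical names made up for this example.
Only the LOCAL_DISTANCE/REMOTE_DISTANCE fallback logic mirrors the macro
quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20
#define NR_NODES        4   /* hypothetical toy topology size */

/* Same logic as the generic fallback macro: only "local" or "remote". */
static int generic_node_distance(int from, int to)
{
	return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/* Hypothetical distance table for a 4-node line topology (assumption). */
static const int toy_table[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

/* Stand-in for an arch-provided node_distance() backed by a real table. */
static int table_node_distance(int from, int to)
{
	assert(from >= 0 && from < NR_NODES && to >= 0 && to < NR_NODES);
	return toy_table[from][to];
}

typedef int (*distance_fn)(int from, int to);

/*
 * Return 1 if the distance from 'start' to each node in order[] never
 * decreases as we walk the array, 0 otherwise.
 */
static int walk_is_monotonic(distance_fn dist, int start,
			     const int *order, size_t n)
{
	int prev = dist(start, start);

	for (size_t i = 0; i < n; i++) {
		int d = dist(start, order[i]);

		if (d < prev)
			return 0;
		prev = d;
	}
	return 1;
}
```

With the toy table, a walk from node 0 that visits node 3 before node 1
is rejected. With the generic fallback every remote node sits at the same
distance, so the same wrong ordering passes the check.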