On Wed 06-05-20 15:33:36, Vlastimil Babka wrote:
> On 5/4/20 12:26 PM, Michal Hocko wrote:
> > On Mon 04-05-20 12:33:04, Sandipan Das wrote:
> >> For unpopulated zones, the pagesets point to the common
> >> boot_pageset which can have non-zero vm_numa_stat counts.
> >> Because of this memory-less nodes end up having non-zero
> >> NUMA statistics. This can be observed on any architecture
> >> that supports memory-less NUMA nodes.
> >>
> >> E.g.
> >>
> >> $ numactl -H
> >> available: 2 nodes (0-1)
> >> node 0 cpus: 0 1 2 3
> >> node 0 size: 0 MB
> >> node 0 free: 0 MB
> >> node 1 cpus: 4 5 6 7
> >> node 1 size: 8131 MB
> >> node 1 free: 6980 MB
> >> node distances:
> >> node   0   1
> >>   0:  10  40
> >>   1:  40  10
> >>
> >> $ numastat
> >>                  node0   node1
> >> numa_hit           108   56495
> >> numa_miss            0       0
> >> numa_foreign         0       0
> >> interleave_hit       0    4537
> >> local_node         108   31547
> >> other_node           0   24948
> >>
> >> Hence, return zero explicitly for all the stats of an
> >> unpopulated zone.
> >
> > I hope I am not just confused but I would expect that at least
> > numa_foreign and other_node to be non zero.
> Hmm, checking zone_statistics():
>
> NUMA_FOREIGN increment uses preferred zone, which is the first in zone in
> zonelist, so it will be a zone from node 1 even for allocations on cpu
> associated to node 0 - assuming node 0's unpopulated zones are not included in
> node 0's zonelist.

But the allocation could have been requested for node 0 regardless of
the amount of memory the node has.

> NUMA_OTHER uses numa_node_id(), which would mean the node 0's cpus have node 1
> in their numa_node_id() ? Is that correct?

numa_node_id should reflect the real node the CPU is associated with.

-- 
Michal Hocko
SUSE Labs
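
For reference, the accounting discussed in this thread can be sketched as a
small user-space model. This is not kernel code: model_zone_statistics(), the
stats[][] array and the enum names are hypothetical stand-ins, and the
branching is only my reading of what zone_statistics() in mm/page_alloc.c
does, so please check it against the actual source. It takes the node of the
preferred zone, the node of the zone the page really came from, and the value
numa_node_id() would return on the allocating CPU.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_NODES 2

/* Hypothetical stand-ins for the kernel's per-zone NUMA counters. */
enum numa_stat {
    NUMA_HIT, NUMA_MISS, NUMA_FOREIGN, NUMA_LOCAL, NUMA_OTHER, NR_NUMA_STATS
};

static unsigned long stats[MAX_NODES][NR_NUMA_STATS];

/*
 * preferred_nid: node of the first zone in the zonelist used for the request
 * alloc_nid:     node of the zone the page was actually taken from
 * cpu_nid:       node reported by numa_node_id() on the allocating CPU
 */
static void model_zone_statistics(int preferred_nid, int alloc_nid, int cpu_nid)
{
    assert(preferred_nid >= 0 && preferred_nid < MAX_NODES);
    assert(alloc_nid >= 0 && alloc_nid < MAX_NODES);

    /* local_node vs. other_node depends only on the CPU's own node. */
    enum numa_stat local_stat = (alloc_nid != cpu_nid) ? NUMA_OTHER : NUMA_LOCAL;

    if (alloc_nid == preferred_nid) {
        stats[alloc_nid][NUMA_HIT]++;
    } else {
        /* The miss is charged to the node that served the page ... */
        stats[alloc_nid][NUMA_MISS]++;
        /* ... and the foreign count to the node that was preferred. */
        stats[preferred_nid][NUMA_FOREIGN]++;
    }
    stats[alloc_nid][local_stat]++;
}

/* Print the counters for one node in a numastat-like order. */
static void print_node_stats(int nid)
{
    printf("node%d: hit=%lu miss=%lu foreign=%lu local=%lu other=%lu\n",
           nid, stats[nid][NUMA_HIT], stats[nid][NUMA_MISS],
           stats[nid][NUMA_FOREIGN], stats[nid][NUMA_LOCAL],
           stats[nid][NUMA_OTHER]);
}
```

Under this model, a CPU on memory-less node 0 whose preferred zone already
sits on node 1, i.e. model_zone_statistics(1, 1, 0), only bumps numa_hit and
other_node on node 1. If the preferred zone is instead a node 0 zone,
model_zone_statistics(0, 1, 0) bumps numa_miss and other_node on node 1 and
numa_foreign on node 0, which is the non-zero numa_foreign I would expect to
see for a request that asked for node 0.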