For unpopulated zones, the pagesets point to the common
boot_pageset, which can have non-zero vm_numa_stat counts.
Because of this, memory-less nodes end up having non-zero
NUMA statistics. This can be observed on any architecture
that supports memory-less NUMA nodes.

E.g.

  $ numactl -H
  available: 2 nodes (0-1)
  node 0 cpus: 0 1 2 3
  node 0 size: 0 MB
  node 0 free: 0 MB
  node 1 cpus: 4 5 6 7
  node 1 size: 8131 MB
  node 1 free: 6980 MB
  node distances:
  node   0   1
    0:  10  40
    1:  40  10

  $ numastat
                             node0           node1
  numa_hit                     108           56495
  numa_miss                      0               0
  numa_foreign                   0               0
  interleave_hit                 0            4537
  local_node                   108           31547
  other_node                     0           24948

Hence, return zero explicitly for all the stats of an
unpopulated zone.

Signed-off-by: Sandipan Das <sandipan@xxxxxxxxxxxxx>
---
 include/linux/vmstat.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 292485f3d24d..55a68b379a2c 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -159,6 +159,21 @@ static inline unsigned long zone_numa_state_snapshot(struct zone *zone,
 	long x = atomic_long_read(&zone->vm_numa_stat[item]);
 	int cpu;
 
+	/*
+	 * Initially, the pagesets of all zones are set to point to the
+	 * boot_pageset. The real pagesets are allocated later but only
+	 * for the populated zones. Unpopulated zones still continue
+	 * using the boot_pageset.
+	 *
+	 * Before the real pagesets are allocated, the boot_pageset's
+	 * vm_numa_stat counters can get incremented. This affects the
+	 * unpopulated zones which end up with non-zero stats despite
+	 * having no memory associated with them. For such cases,
+	 * return zero explicitly.
+	 */
+	if (!populated_zone(zone))
+		return 0;
+
 	for_each_online_cpu(cpu)
 		x += per_cpu_ptr(zone->pageset, cpu)->vm_numa_stat_diff[item];
 
-- 
2.17.1
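
The effect of the added check can be modelled in userspace. The following is only a simplified sketch, not kernel code: struct zone_model, struct pageset_model, populated_zone_model(), snapshot_model() and the two constants are hypothetical stand-ins. The sketch assumes that a zone counts as populated when it has a non-zero number of present pages, and it replaces the per-CPU accessors with a plain array indexed by CPU.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_MODEL        4	/* hypothetical CPU count for the model */
#define NR_NUMA_ITEMS_MODEL  2	/* hypothetical number of NUMA stat items */

/* Hypothetical stand-in for a per-CPU pageset: only the NUMA stat diffs. */
struct pageset_model {
	unsigned short vm_numa_stat_diff[NR_NUMA_ITEMS_MODEL];
};

/* Hypothetical stand-in for struct zone, reduced to the fields used here. */
struct zone_model {
	unsigned long present_pages;
	long vm_numa_stat[NR_NUMA_ITEMS_MODEL];
	struct pageset_model *pageset;	/* array of NR_CPUS_MODEL entries */
};

/* Assumption: a zone is populated when it has at least one present page. */
static bool populated_zone_model(const struct zone_model *zone)
{
	return zone->present_pages != 0;
}

/*
 * Model of the patched snapshot: an unpopulated zone reports zero even if
 * the pageset it points to (the shared boot pageset) holds non-zero diffs.
 */
static unsigned long snapshot_model(const struct zone_model *zone, int item)
{
	long x = zone->vm_numa_stat[item];
	int cpu;

	if (!populated_zone_model(zone))
		return 0;

	/* Fold the per-CPU deltas into the global counter. */
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		x += zone->pageset[cpu].vm_numa_stat_diff[item];

	return (unsigned long)x;
}
```

In this model, an unpopulated zone whose pageset pointer refers to a shared boot pageset with diffs of 100 and 8 for item 0 would report 108 without the check and reports 0 with it. A populated zone is unaffected and still returns its global counter plus the per-CPU diffs.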