The existing implementation keeps NUMA counters per logical CPU, along
with zone->vm_numa_stat[] separated by zone, plus a global NUMA counter
array vm_numa_stat[]. However, unlike the other vmstat counters, NUMA
stats don't affect the system's decisions and are only consumed when
reading from /proc and /sys. Also, nodes usually have only a single
zone, except for node 0, and there is no real use case that needs these
hit counts separated by zone.

Therefore, we can migrate the implementation of NUMA stats from per-zone
to per-node (as suggested by Andi Kleen), and reuse the existing per-cpu
infrastructure with a small enhancement for NUMA stats. This removes the
special-case handling for NUMA stats while keeping the performance gain.
With this patch series, about 170 lines of code are removed.

The first patch migrates NUMA stats from per-zone to per-node using the
existing per-cpu infrastructure. There is a small user-visible change
when reading /proc/zoneinfo, listed below:

         Before                                 After
Node 0, zone      DMA                   Node 0, zone      DMA
  per-node stats                          per-node stats
      nr_inactive_anon 7244                *numa_hit     98665086*
      nr_active_anon 177064                *numa_miss    0*
      ...                                  *numa_foreign 0*
      nr_bounce    0                       *numa_interleave 21059*
      nr_free_cma  0                       *numa_local   98665086*
     *numa_hit     0*                      *numa_other   0*
     *numa_miss    0*                       nr_inactive_anon 20055
     *numa_foreign 0*                       nr_active_anon 389771
     *numa_interleave 0*                    ...
     *numa_local   0*                       nr_bounce    0
     *numa_other   0*                       nr_free_cma  0

The second patch extends the local cpu counter vm_node_stat_diff from s8
to s16. It makes no functional change.

The third patch uses a large, constant threshold for NUMA counters to
reduce how often the global NUMA counters are updated.

The fourth patch uses node_page_state_snapshot instead of
node_page_state when querying a node's stats (e.g. cat
/sys/devices/system/node/node*/vmstat). The only difference is that
node_page_state_snapshot also includes the stat values still held in
the local cpu counters.

The last patch renames zone_statistics() to numa_statistics().

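To illustrate how patches 2-4 fit together, here is a simplified
user-space sketch of a per-cpu delta that is folded into a global
counter once it exceeds a threshold, with a plain read and a snapshot
read. It is not the kernel code: NR_CPUS, NUMA_THRESHOLD and all the
function and variable names below are hypothetical, and locking,
preemption and negative deltas are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS        4     /* hypothetical CPU count for this sketch */
#define NUMA_THRESHOLD 1000  /* hypothetical large, constant threshold */

static long    node_numa_hit;              /* global per-node counter */
static int16_t cpu_numa_hit_diff[NR_CPUS]; /* per-cpu pending deltas (s16) */

/*
 * Count one event on `cpu`. The global counter is only touched once the
 * local delta exceeds the threshold, so most updates stay cpu-local.
 */
static void numa_hit_inc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	int16_t v = cpu_numa_hit_diff[cpu] + 1;

	if (v > NUMA_THRESHOLD) {
		node_numa_hit += v;	/* fold the local delta into the global */
		v = 0;
	}
	cpu_numa_hit_diff[cpu] = v;
}

/* Plain read: global value only; pending per-cpu deltas are not seen. */
static long numa_hit_read(void)
{
	return node_numa_hit;
}

/* Snapshot read: global value plus every pending per-cpu delta. */
static long numa_hit_read_snapshot(void)
{
	long x = node_numa_hit;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		x += cpu_numa_hit_diff[cpu];
	return x;
}
```

With these assumed values, 500 increments on one CPU leave
numa_hit_read() at 0 while numa_hit_read_snapshot() returns 500. A
threshold of this size would not fit in an s8 delta (maximum 127), which
is presumably why the local counter is widened to s16 first.
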
Finally, I want to express my sincere appreciation to Michal Hocko for
suggesting the reuse of the existing per-cpu infrastructure, which made
this series much better than before.

Changelog:
  v1->v2:
  a) enhance the existing per-cpu infrastructure for node page stats by
     extending the local cpu counters vm_node_stat_diff from s8 to s16
  b) reuse the per-cpu infrastructure for NUMA stats

Kemi Wang (5):
  mm: migrate NUMA stats from per-zone to per-node
  mm: Extends local cpu counter vm_diff_nodestat from s8 to s16
  mm: enlarge NUMA counters threshold size
  mm: use node_page_state_snapshot to avoid deviation
  mm: Rename zone_statistics() to numa_statistics()

 drivers/base/node.c    |  28 +++---
 include/linux/mmzone.h |  31 ++++----
 include/linux/vmstat.h |  31 --------
 mm/mempolicy.c         |   2 +-
 mm/page_alloc.c        |  22 +++---
 mm/vmstat.c            | 206 +++++++++---------------------------------------
 6 files changed, 74 insertions(+), 246 deletions(-)

--
2.7.4