This patch introduces NUMA execution time information, to help gauge
the NUMA efficiency.

After 'cat /sys/fs/cgroup/memory/CGROUP_PATH/memory.numa_stat', a new
output line starting with 'exectime' appears, like:

  exectime 24399843 27865444

which means the tasks of this cgroup executed for 24399843 ticks on
node 0, and 27865444 ticks on node 1.

Combined with the per-node memory info, we can estimate the NUMA
efficiency. For example, if memory.numa_stat shows:

  total=4613257 N0=6849 N1=3928327
  ...
  exectime 24399843 27865444

then almost all of the memory is on N1, while the execution time is
split nearly evenly between the two nodes. There could be unmovable or
cache pages on N1, in which case good locality could mean nothing since
we are not tracing these types of pages, so binding the workloads to
the CPUs of N1 is worth a try in order to achieve the maximum
performance bonus.

Signed-off-by: Michael Wang <yun.wang@xxxxxxxxxxxxxxxxx>
---
 include/linux/memcontrol.h |  1 +
 mm/memcontrol.c            | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index bb62e6294484..e784d6252d5e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -197,6 +197,7 @@ enum memcg_numa_locality_interval {
 
 struct memcg_stat_numa {
 	u64 locality[NR_NL_INTERVAL];
+	u64 exectime;
 };
 
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b810d4e9c906..91bcd71fc38a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3409,6 +3409,18 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v)
 		seq_printf(m, " %llu", sum);
 	}
 	seq_putc(m, '\n');
+
+	seq_puts(m, "exectime");
+	for_each_online_node(nr) {
+		int cpu;
+		u64 sum = 0;
+
+		for_each_cpu(cpu, cpumask_of_node(nr))
+			sum += per_cpu(memcg->stat_numa->exectime, cpu);
+
+		seq_printf(m, " %llu", sum);
+	}
+	seq_putc(m, '\n');
 #endif
 
 	return 0;
@@ -3437,6 +3449,7 @@ void memcg_stat_numa_update(struct task_struct *p)
 	memcg = mem_cgroup_from_task(p);
 	if (idx != -1)
 		this_cpu_inc(memcg->stat_numa->locality[idx]);
+	this_cpu_inc(memcg->stat_numa->exectime);
 	rcu_read_unlock();
 }
 #endif
-- 
2.14.4.44.g2045bb6
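
For anyone who wants to consume the new line from userspace, here is a
minimal sketch (not part of this patch) of how the 'exectime' line could
be parsed and turned into a per-node share of execution time. The helper
names parse_exectime() and exectime_share() are hypothetical, and the
sketch assumes the line has exactly the format shown in the changelog:
the keyword followed by one space-separated tick count per online node.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse "exectime <n0> <n1> ..." into vals[].
 * Returns the number of per-node values parsed, or -1 if the line
 * does not start with the "exectime" keyword.
 */
static int parse_exectime(const char *line, uint64_t *vals, int max_vals)
{
	static const char key[] = "exectime";
	const size_t klen = sizeof(key) - 1;
	const char *p;
	int n = 0;

	assert(line && vals);

	if (strncmp(line, key, klen) != 0)
		return -1;
	/* The keyword must end here, not merely be a prefix. */
	if (line[klen] != ' ' && line[klen] != '\n' && line[klen] != '\0')
		return -1;

	p = line + klen;
	while (n < max_vals) {
		char *end;
		unsigned long long v;

		errno = 0;
		v = strtoull(p, &end, 10);
		if (end == p || errno != 0)
			break;	/* no more numbers, or out of range */
		vals[n++] = (uint64_t)v;
		p = end;
	}
	return n;
}

/*
 * Hypothetical helper: fraction of the total ticks spent on 'node'.
 * Returns 0.0 for an invalid node index or when no ticks were recorded.
 */
static double exectime_share(const uint64_t *vals, int n, int node)
{
	uint64_t total = 0;
	int i;

	for (i = 0; i < n; i++)
		total += vals[i];
	if (node < 0 || node >= n || total == 0)
		return 0.0;
	return (double)vals[node] / (double)total;
}
```

Fed the example line from the changelog, exectime_share() for node 1
comes out slightly above one half, which can then be compared against
the N0/N1 page counts from the same file.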