Hi.

On Fri, Dec 13, 2019 at 09:47:36AM +0800, 王贇 <yun.wang@xxxxxxxxxxxxxxxxx> wrote:
> By monitoring the increments, we will be able to locate the per-cgroup
> workload which NUMA Balancing can't help with (usually caused by wrong
> CPU and memory node bindings), then we got chance to fix that in time.
I just wonder: does the data based on increments match what you
obtained previously?

> +static inline void
> +update_task_locality(struct task_struct *p, int pnid, int cnid, int pages)
> +{
> +	if (!static_branch_unlikely(&sched_numa_locality))
> +		return;
> +
> +	/*
> +	 * pnid != cnid --> remote	idx 0
> +	 * pnid == cnid --> local	idx 1
> +	 */
> +	p->numa_page_access[!!(pnid == cnid)] += pages;
If the per-task information isn't used anywhere, why not accumulate
directly into the task's cfs_rq->{local,remote}_page_access?

> @@ -4298,6 +4359,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
>  	 */
>  	update_load_avg(cfs_rq, curr, UPDATE_TG);
>  	update_cfs_group(curr);
> +	update_group_locality(cfs_rq);
With the per-NUMA node time tracked separately, isn't it unnecessary
to do group updates inside entity_tick()?

Regards,
Michal
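
P.S. To make the indexing question concrete, here is a minimal userspace sketch of the accounting in the quoted hunk, with the counter kept in a group-level structure rather than per task. It is only an illustration under assumptions: struct locality_stat, account_locality() and the `enabled` flag are hypothetical stand-ins (for the cfs_rq counters and the sched_numa_locality static branch), not kernel code and not part of the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for group-level counters; not the kernel's struct cfs_rq. */
struct locality_stat {
	unsigned long page_access[2];	/* [0] = remote, [1] = local */
};

/*
 * Same indexing as the quoted hunk: the comparison evaluates to
 * 0 (pnid != cnid, remote) or 1 (pnid == cnid, local) and selects
 * the counter to bump.
 */
static inline void account_locality(struct locality_stat *st, bool enabled,
				    int pnid, int cnid, int pages)
{
	if (!enabled)	/* models the static-branch check in the patch */
		return;

	st->page_access[pnid == cnid] += pages;
}
```

Since `==` in C already yields 0 or 1, the `!!` in the quoted hunk is not strictly needed for the index to be valid; the sketch drops it.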