Re: [PATCH v6 1/2] sched/numa: introduce per-cgroup NUMA locality info

Michal Koutný <mkoutny@xxxxxxxx> · Fri, 3 Jan 2020 16:14:49 +0100

Hi.

On Fri, Dec 13, 2019 at 09:47:36AM +0800, 王贇 <yun.wang@xxxxxxxxxxxxxxxxx> wrote:
> By monitoring the increments, we will be able to locate the per-cgroup
> workload which NUMA Balancing can't help with (usually caused by wrong
> CPU and memory node bindings), then we got chance to fix that in time.
I just wonder: do the data based on increments match those you
obtained previously?

> +static inline void
> +update_task_locality(struct task_struct *p, int pnid, int cnid, int pages)
> +{
> +	if (!static_branch_unlikely(&sched_numa_locality))
> +		return;
> +
> +	/*
> +	 * pnid != cnid --> remote idx 0
> +	 * pnid == cnid --> local idx 1
> +	 */
> +	p->numa_page_access[!!(pnid == cnid)] += pages;
If the per-task information isn't used anywhere, why not accumulate
directly into the task's cfs_rq->{local,remote}_page_access?
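
Roughly what I have in mind, as an untested userspace mock-up rather than
a real patch. The struct layouts, the *_mock names and the boolean that
stands in for the sched_numa_locality static branch are all made up for
illustration, and locking/migration concerns are ignored here:

```c
/* Userspace mock-up only; types and field names are hypothetical. */
#include <stdbool.h>

struct cfs_rq_mock {
	unsigned long remote_page_access;
	unsigned long local_page_access;
};

struct task_mock {
	struct cfs_rq_mock *cfs_rq;	/* stand-in for the task's cfs_rq */
};

/* Stand-in for static_branch_unlikely(&sched_numa_locality). */
static bool numa_locality_enabled = true;

static inline void
update_task_locality(struct task_mock *p, int pnid, int cnid, int pages)
{
	if (!numa_locality_enabled)
		return;

	/* Account into the group's counters directly, no per-task array. */
	if (pnid == cnid)
		p->cfs_rq->local_page_access += pages;
	else
		p->cfs_rq->remote_page_access += pages;
}
```

That would drop the per-task numa_page_access[] array and the later step
of folding it into the group.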

> @@ -4298,6 +4359,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
>  	 */
>  	update_load_avg(cfs_rq, curr, UPDATE_TG);
>  	update_cfs_group(curr);
> +	update_group_locality(cfs_rq);
With the per-NUMA node time tracked separately, isn't it unnecessary
to do group updates inside entity_tick()?

Regards,
Michal