On Mon, Dec 06, 2021 at 10:45:28AM +0800, Gang Li wrote:
> This patch add a new api PR_NUMA_BALANCING in prctl.
>
> A large number of page faults will cause performance loss when numa
> balancing is performing. Thus those processes which care about worst-case
> performance need numa balancing disabled. Others, on the contrary, allow a
> temporary performance loss in exchange for higher average performance, so
> enable numa balancing is better for them.
>
> Numa balancing can only be controlled globally by
> /proc/sys/kernel/numa_balancing. Due to the above case, we want to
> disable/enable numa_balancing per-process instead.
>
> Add numa_balancing under mm_struct. Then use it in task_tick_fair.
>
> Set per-process numa balancing:
> 	prctl(PR_NUMA_BALANCING, PR_SET_NUMAB_DISABLE); //disable
> 	prctl(PR_NUMA_BALANCING, PR_SET_NUMAB_ENABLE);  //enable
> 	prctl(PR_NUMA_BALANCING, PR_SET_NUMAB_DEFAULT); //follow global

This seems to imply you can prctl(ENABLE) even if the global is disabled,
IOW sched_numa_balancing is off.

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 884f29d07963..2980f33ac61f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11169,8 +11169,12 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
> 		entity_tick(cfs_rq, se, queued);
> 	}
>
> -	if (static_branch_unlikely(&sched_numa_balancing))
> +#ifdef CONFIG_NUMA_BALANCING
> +	if (curr->mm && (curr->mm->numab_enabled == NUMAB_ENABLED
> +			|| (static_branch_unlikely(&sched_numa_balancing)
> +			&& curr->mm->numab_enabled == NUMAB_DEFAULT)))
> 		task_tick_numa(rq, curr);
> +#endif
>
> 	update_misfit_status(curr, rq);
> 	update_overutilized_status(task_rq(curr));

There's just about everything wrong there... not least of all the horrific
coding style.