On 10/28/21 11:30 PM, Mel Gorman wrote:
> That aside though, the configuration space could be better. It's possible
> to selectively disable NUMA balance but not selectively enable because
> prctl is disabled if global NUMA balancing is disabled. That could be
> somewhat achieved by having a default value for mm->numa_balancing based on
> whether the global numa balancing is disabled via command line or sysctl
> and enabling the static branch if prctl is used with an informational
> message. This is not the only potential solution but as it stands,
> there are odd semantic corner cases. For example, explicit enabling
> of NUMA balancing by prctl gets silently revoked if numa balancing is
> disabled via sysctl and prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING,
> 1) means nothing.

static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
{
	...
	if (static_branch_unlikely(&sched_numa_balancing))
		task_tick_numa(rq, curr);
	...
}

static void task_tick_numa(struct rq *rq, struct task_struct *curr)
{
	...
	if (!READ_ONCE(curr->mm->numa_balancing))
		return;
	...
}

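When global NUMA balancing is disabled, the sched_numa_balancing static
branch is off, so task_tick_numa() is never called and the value of
mm->numa_balancing has no effect. So I think
prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING, 0/1) should return an
error in that case instead of modifying mm->numa_balancing.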

Or is it reasonable for prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING, 0/1)
to still change the value of mm->numa_balancing while global NUMA
balancing is disabled?
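
To make the "return an error" option concrete, here is a rough userspace
model of the check. It is only a sketch, not kernel code: every model_*
name is hypothetical, the bool stands in for the sched_numa_balancing
static branch, and -EPERM is just an assumed error value that would still
need to be agreed on.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model only; none of the model_* names are kernel APIs. */

/* Stands in for the sched_numa_balancing static branch. */
static bool model_global_enabled;

struct model_mm {
	int numa_balancing;	/* stands in for mm->numa_balancing */
};

/* Mirrors the two checks in task_tick_fair() and task_tick_numa(). */
static bool model_tick_scans(const struct model_mm *mm)
{
	if (!model_global_enabled)
		return false;	/* global switch off: per-mm value is never read */
	return mm->numa_balancing != 0;
}

/*
 * Possible PR_SET_NUMA_BALANCING behaviour: refuse the request while
 * global balancing is off, leaving the per-mm value untouched.
 * -EPERM is an assumption, not a decided return value.
 */
static int model_set_numa_balancing(struct model_mm *mm, int enable)
{
	if (!model_global_enabled)
		return -EPERM;
	mm->numa_balancing = !!enable;
	return 0;
}
```

With this variant a caller finds out immediately that the request cannot
take effect, instead of setting a value that the tick path never looks at.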