On Thu, 2019-12-12 at 12:27 +0100, Sebastian Andrzej Siewior wrote:
> If a user task changes the CPU affinity mask of a running task, it
> will dispatch a migration request if the current CPU is no longer
> allowed. This might happen shortly before a task enters a
> migrate_disable() section. Upon leaving the migrate_disable() section,
> the task will notice that the current CPU is no longer allowed and
> will dispatch its own migration request to move it off the current
> CPU.
> While invoking __schedule() the first migration request will be
> processed and the task returns on the "new" CPU with "arg.done = 0".
> Its own migration request will be processed shortly after and will
> result in memory corruption if the stack memory set up for the request
> has been reused in the meantime.

Ugh.

> Spin until the migration request has been processed if it was
> accepted.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> ---
>  kernel/sched/core.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8bea013b2baf5..5c7be96ca68c4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8227,7 +8227,7 @@ void migrate_enable(void)
>
>  	WARN_ON(smp_processor_id() != cpu);
>  	if (!is_cpu_allowed(p, cpu)) {
> -		struct migration_arg arg = { p };
> +		struct migration_arg arg = { .task = p };
>  		struct cpu_stop_work work;
>  		struct rq_flags rf;
>
> @@ -8239,7 +8239,10 @@ void migrate_enable(void)
>  		stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
>  				    &arg, &work);
>  		__schedule(true);
> -		WARN_ON_ONCE(!arg.done && !work.disabled);
> +		if (!work.disabled) {
> +			while (!arg.done)
> +				cpu_relax();
> +		}

We should enable preemption while spinning -- besides the general
badness of spinning with it disabled, there could be deadlock scenarios
if multiple CPUs are spinning in such a loop.

Long term, maybe have a way to dequeue the no-longer-needed work
instead of waiting.

-Scott
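
P.S. A rough sketch of the preemption-enabled spin I have in mind
(untested, and assuming nothing between this point and the
preempt_enable() at the end of migrate_enable() still relies on
preemption being disabled):

	if (!work.disabled) {
		/*
		 * Let other tasks run while we wait for the stopper to
		 * finish with arg: spinning here with preemption
		 * disabled can deadlock if several CPUs end up in this
		 * loop waiting on each other's stopper threads.
		 */
		preempt_enable();
		while (!READ_ONCE(arg.done))
			cpu_relax();
		preempt_disable();
	}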