On 6 Sep 2022 20:54:58 +0800 Kuyo Chang <kuyo.chang@xxxxxxxxxxxx> wrote
> Hi,
>
> [Syndrome]
> A list_add corruption error on kernel-5.15; the log shows:
>
> list_add corruption. prev->next should be next (ffffff81a6f08ba0), but
> was 0000000000000000. (prev=ffffff81a6f05930).
>
> The call trace is as below:
>
> ipanic_die
> notify_die
> die
> bug_handler
> brk_handler
> do_debug_exception
> el1_dbg
> el1h_64_sync_handler
> el1h_64_sync
> __list_add_valid
> cpu_stop_queue_work
> stop_one_cpu_nowait
> balance_push
> __schedule
> schedule
> do_sched_yield
> __arm64_sys_sched_yield
> invoke_syscall
> el0_svc_common
> do_el0_svc
> el0_svc
> el0t_64_sync_handler
> el0t_64_sync
>
> [Analysis]
> From the memory dump and analysis of the stopper->works list, the
> erroneous code flow is as follows:
>
> migrate_swap
>   ->stop_two_cpus
>     ->cpu_stop_queue_two_works
>       ->__cpu_stop_queue_work (adds each work->list to stopper->works)
>         ->list_add_tail(&work->list, &stopper->works);
>   ->wake_up_q(&wakeq);
>   ->wait_for_completion(&done.completion);
>     ->wait_for_common
>       ->schedule_timeout
>         ->schedule
>
> At this point, CPU hotplug is triggered.
> It registers balance_callback via the flow below:
>
> cpu_down(cpuid)
>   ->_cpu_down
>     ->cpuhp_set_state()
>       ->set_cpu_dying(cpuid, true)
>     ->sched_cpu_deactivate
>       ->balance_push_set(cpuid, true)
>         ->rq->balance_callback = &balance_push_callback;
>
> Finally:
>
> ->__schedule
>   ->__balance_callbacks
>     ->do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
>       ->balance_push
>         ->stop_one_cpu_nowait
>           *work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg, .caller = _RET_IP_, };
>
> At this point the list_head's next and prev are initialized to NULL,
> even though the work is still queued!
>
>           ->cpu_stop_queue_work
>             ->__list_add_valid
>
> Do you have any suggestion for this issue?

See if making balance_push() non-reentrant removes the chance of a
double list add in your case.
Hillf

--- linux-5.15/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8815,6 +8815,7 @@ static int __balance_push_cpu_stop(void
 		cpu = select_fallback_rq(rq->cpu, p);
 		rq = __migrate_task(rq, &rf, p, cpu);
 	}
+	this_cpu_ptr(&push_work)->queued = 0;
 	rq_unlock(rq, &rf);
 	raw_spin_unlock_irq(&p->pi_lock);
 
@@ -8838,6 +8839,8 @@ static void balance_push(struct rq *rq)
 
 	lockdep_assert_rq_held(rq);
 
+	if (WARN_ON_ONCE(this_cpu_ptr(&push_work)->queued != 0))
+		return;
 	/*
 	 * Ensure the thing is persistent until balance_push_set(.on = false);
 	 */
@@ -8877,6 +8880,7 @@ static void balance_push(struct rq *rq)
 		return;
 	}
 
+	this_cpu_ptr(&push_work)->queued = 1;
 	get_task_struct(push_task);
 	/*
 	 * Temporarily drop rq->lock such that we can wake-up the stop task.
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -27,6 +27,7 @@ struct cpu_stop_work {
 	unsigned long caller;
 	void *arg;
 	struct cpu_stop_done *done;
+	unsigned queued;
 };
 
 int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg);