On Tue, May 01, 2018 at 04:10:53PM +0530, Kohli, Gaurav wrote:

> Yes with loop, it will reset TASK_PARKED but that is not happening in the
> dumps we have seen.

But was that with or without the fixed wait-loop? I don't care about
stuff you might have seen with the current code, that is clearly broken.

> > takedown_cpu() can proceed beyond smpboot_park_threads() and kill the
> > CPU before any of the threads are parked -- per having the complete()
> > before hitting schedule().
> >
> > And, afaict, that is harmless. When we go offline, sched_cpu_dying() ->
> > migrate_tasks() will migrate any still runnable threads off the cpu.
> > But because at this point the thread must be in the PARKED wait-loop, it
> > will hit schedule() and go to sleep eventually.
> >
> > Also note that kthread_unpark() does __kthread_bind() to rebind the
> > threads.
> >
> > Aaaah... I think I've spotted a problem there. We clear SHOULD_PARK
> > before we rebind, so if the thread lost the first PARKED store,
> > does the completion, gets migrated, cycles through the loop and now
> > observes !SHOULD_PARK and bails the wait-loop, then __kthread_bind()
> > will forever wait.
> >
>
> So during next unpark
> __kthread_unpark -> __kthread_bind -> wait_task_inactive (this got failed,
> as current state is running so failed on below call:

Aah, yes, I seem to have mis-remembered how wait_task_inactive() works.

And it is indeed still a problem..

Let me ponder what the best solution is, it's a bit of a mess.
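For reference, the interleaving discussed above can be replayed with a small single-threaded userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the "lost PARKED store" is modelled by simply overwriting the state, and `model_wait_task_inactive()` mimics only the state-match check (a mismatch means failure rather than waiting).

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_state { MODEL_RUNNING, MODEL_PARKED };

struct model_kthread {
    enum model_state state; /* stands in for the task state */
    bool should_park;       /* stands in for SHOULD_PARK */
    bool completed;         /* stands in for the completion seen by the parker */
};

/*
 * One pass of the wait-loop, up to where schedule() would be called.
 * Returns true if the thread stays in the loop, false if it bails out.
 */
static bool model_park_pass(struct model_kthread *t)
{
    t->state = MODEL_PARKED;
    if (!t->should_park) {
        t->state = MODEL_RUNNING; /* leaving the wait-loop */
        return false;
    }
    t->completed = true;
    return true;
}

/* Models only the state-match check: a mismatch is reported as failure. */
static bool model_wait_task_inactive(const struct model_kthread *t,
                                     enum model_state match)
{
    return t->state == match;
}

/* Replays the interleaving; returns true if the rebind check would pass. */
static bool model_unpark_race(void)
{
    struct model_kthread t = { MODEL_RUNNING, true, false };

    model_park_pass(&t);     /* thread: stores PARKED, does the completion */
    t.state = MODEL_RUNNING; /* assumption: the PARKED store is lost */
    t.should_park = false;   /* unparker: clears SHOULD_PARK before rebinding */
    model_park_pass(&t);     /* thread: observes !SHOULD_PARK and bails */

    /* unparker: rebind expects a PARKED task but finds a running one */
    return model_wait_task_inactive(&t, MODEL_PARKED);
}
```

In this model `model_unpark_race()` returns false: the thread has already left the wait-loop in the running state by the time the rebind check looks for a PARKED task, which corresponds to the wait_task_inactive failure reported in the quoted text.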