Hi Mukesh,

https://lore.kernel.org/lkml/YvrWaml3F+x9Dk+T@xxxxxxxxxxxxxxx/
fixes the cgroup_threadgroup_rwsem <-> cpus_read_lock() deadlock.
This issue, however, is a cgroup_threadgroup_rwsem <-> cpuset_rwsem deadlock.
I don't think they are the same issue.
Is the patch useful for this issue?

Best regards,
Jing-Ting Wu

On Mon, 2022-09-05 at 12:14 +0530, Mukesh Ojha wrote:
> This is fixed by this.
>
> https://lore.kernel.org/lkml/YvrWaml3F+x9Dk+T@xxxxxxxxxxxxxxx/
>
> -Mukesh
>
> On 9/5/2022 8:17 AM, Jing-Ting Wu wrote:
> > Hi,
> >
> > We hit a HANG_DETECT on the T SW version with kernel-5.15.
> > Many tasks have been blocked for a long time.
> >
> >
> > Root cause:
> > migration_cpu_stop() does not complete: because
> > is_migration_disabled(p) is true, complete stays false and
> > complete_all() never gets executed.
> > This leaves other tasks waiting on the rwsem.
> >
> > Detail:
> > system_server is waiting for cgroup_threadgroup_rwsem.
> > OomAdjuster is holding cgroup_threadgroup_rwsem and waiting for
> > cpuset_rwsem.
> > cpuset_hotplug_workfn is holding cpuset_rwsem and waiting for
> > affine_move_task() to complete.
> > affine_move_task() is waiting for migration_cpu_stop() to complete.
> >
> > The backtrace of system_server:
> > __switch_to
> > __schedule
> > schedule
> > percpu_rwsem_wait
> > __percpu_down_read
> > cgroup_css_set_fork => wait for cgroup_threadgroup_rwsem
> > cgroup_can_fork
> > copy_process
> > kernel_clone
> >
> > The backtrace of OomAdjuster:
> > __switch_to
> > __schedule
> > schedule
> > percpu_rwsem_wait
> > percpu_down_write
> > cpuset_can_attach => wait for cpuset_rwsem
> > cgroup_migrate_execute
> > cgroup_attach_task
> > __cgroup1_procs_write => hold cgroup_threadgroup_rwsem
> > cgroup1_procs_write
> > cgroup_file_write
> > kernfs_fop_write_iter
> > vfs_write
> > ksys_write
> >
> > The backtrace of cpuset_hotplug_workfn:
> > __switch_to
> > __schedule
> > schedule
> > schedule_timeout
> > wait_for_common
> > affine_move_task => wait for complete
> > __set_cpus_allowed_ptr_locked
> > update_tasks_cpumask
> > cpuset_hotplug_update_tasks => hold cpuset_rwsem
> > cpuset_hotplug_workfn
> > process_one_work
> > worker_thread
> > kthread
> >
> >
> > affine_move_task() calls migration_cpu_stop() and waits for it to
> > complete.
> > In the normal case, when migration_cpu_stop() completes, it notifies
> > all waiters that it is done.
> > But there is one exceptional case in which no notification is sent:
> > if is_migration_disabled(p) is true, complete stays false and
> > complete_all() never gets executed.
> >
> > static int migration_cpu_stop(void *data)
> > {
> >         ...
> >         bool complete = false;
> >         ...
> >
> >         if (task_rq(p) == rq) {
> >                 /* is_migration_disabled(p) is true, so complete stays false */
> >                 if (is_migration_disabled(p))
> >                         goto out;
> >                 ...
> >         }
> >         ...
> >
> > out:
> >         ...
> >         /* complete is false, so complete_all() never gets executed */
> >         if (complete)
> >                 complete_all(&pending->done);
> >
> >         return 0;
> > }
> >
> >
> > Reviewing the code, we found that many places can change the value
> > of is_migration_disabled()
> > (such as: __rt_spin_lock(), rt_read_lock(), rt_write_lock(), ...)
> >
> > Do you have any suggestions for this issue?
> > Thank you.
> >
> >
> > Best regards,
> > Jing-Ting Wu
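
To make the path described in the quoted report easier to follow, here is a
minimal user-space sketch of only the branch quoted from migration_cpu_stop().
It is not kernel code: model_task, model_pending and
model_migration_cpu_stop() are hypothetical names, task_rq(p) == rq is assumed
to hold, and the "complete = true" line stands in for the elided part of the
real function.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one task field this path looks at. */
struct model_task {
	bool migration_disabled;	/* models is_migration_disabled(p) */
};

/* Hypothetical stand-in for pending->done (a struct completion in the kernel). */
struct model_pending {
	bool done;			/* true once "complete_all()" has run */
};

/*
 * Models only the quoted control flow, assuming task_rq(p) == rq.
 * Everything elided with "..." in the quoted excerpt is not modeled.
 */
static int model_migration_cpu_stop(struct model_task *p,
				    struct model_pending *pending)
{
	bool complete = false;

	if (p->migration_disabled)
		goto out;		/* early exit: complete stays false */

	/* Assumption: the elided code would mark the request as complete. */
	complete = true;

out:
	if (complete)
		pending->done = true;	/* models complete_all(&pending->done) */

	return 0;
}
```

With migration_disabled set, the function returns with pending->done still
false, which is the state in which a waiter such as affine_move_task() would
keep waiting. With it cleared, done is set and the waiter could proceed.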