On Thu, Jan 20, 2022 at 03:14:22PM +0800, Zhang Qiao <zhangqiao22@xxxxxxxxxx> wrote:
> i think the troublesome scenario as follows:
> cpuset_can_attach
>   down_read(cpuset_rwsem)
>     // check all migratees
>   up_read(cpuset_rwsem)
>                                       [ _cpu_down / cpuhp_setup_state ]
> cpuset_attach
>  down_write(cpuset_rwsem)
>   guarantee_online_cpus() // (load cpus_attach)
>                                       sched_cpu_deactivate
>                                         set_cpu_active(cpu, false) // will change cpu_active_mask
>   set_cpus_allowed_ptr(cpus_attach)
>    __set_cpus_allowed_ptr_locked()
>     // (if the intersection of cpus_attach and
>       cpu_active_mask is empty, will return -EINVAL)
>  up_write(cpuset_rwsem)
>                                       schedule_work
>                                         ...
>                                         cpuset_hotplug_update_tasks
>                                           down_write(cpuset_rwsem)
>                                           up_write(cpuset_rwsem)
>                                         ... flush_work
>                                       [ _cpu_down / cpu_up_down_serialize_trainwrecks ]

Thanks, a locking loophole indeed.

FTR, meanwhile I noticed: a) cpuset_fork() looks buggy when
CLONE_INTO_CGROUP (and dst.cpus != src.cpus), b) it'd be affected with
similar hotplug race.

Michal