On Fri, Jan 14, 2022 at 09:15:06AM +0800, Zhang Qiao <zhangqiao22@xxxxxxxxxx> wrote:
> I found the following warning log on qemu. I migrated a task from one cpuset cgroup to
> another, while I also performed the cpu hotplug operation, and got following calltrace.

Do you have more information on which hotplug event and which error
(from set_cpus_allowed_ptr()) you observe? (And what are the src/dst
cpusets wrt root/non-root?)

> Can we use cpus_read_lock()/cpus_read_unlock() to guarantee that set_cpus_allowed_ptr()
> doesn't fail, as follows:

I'm wondering what can be wrong with the current actors:

    cpuset_can_attach
      down_read(cpuset_rwsem)
        // check all migratees
      up_read(cpuset_rwsem)
                                    [ _cpu_down / cpuhp_setup_state ]
                                    schedule_work
                                    ...
                                    cpuset_hotplug_update_tasks
                                      down_write(cpuset_rwsem)
                                      up_write(cpuset_rwsem)
                                    ...
                                    flush_work
                                    [ _cpu_down / cpu_up_down_serialize_trainwrecks ]
    cpuset_attach
      down_write(cpuset_rwsem)
        set_cpus_allowed_ptr(allowed_cpus_weird)
      up_write(cpuset_rwsem)

The statement in cpuset_attach() about the cpuset_can_attach() test is
not that strong, since task_can_attach() is mostly a pass for
non-deadline tasks.

Still, the use of cpuset_rwsem above should synchronize (I may be
mistaken) the changes to the cpuset's cpu masks, so I'd be interested in
the details above to understand why the current approach doesn't work.

The additional cpus_read_{,un}lock (when reordered wrt cpuset_rwsem) may
work, but your patch should explain why (in what situation).

My .02€,
Michal