On 12/10/23 12:35, Waiman Long wrote:
> On 10/11/23 08:54, Waiman Long wrote:

...

> > We can argue that there can be racing between cgroup_exit() and the
> > iteration of tasks in cpuset_attach() or cpuset_can_attach(). An
> > rcu_read_lock() is probably needed. I am still investigating that.
>
> Cgroup has a rather complex task migration and iteration scheme. According
> to the following comments in include/linux/cgroup-defs.h:
>
>         /*
>          * Lists running through all tasks using this cgroup group.
>          * mg_tasks lists tasks which belong to this cset but are in the
>          * process of being migrated out or in. Protected by
>          * css_set_lock, but, during migration, once tasks are moved to
>          * mg_tasks, it can be read safely while holding cgroup_mutex.
>          */
>         struct list_head tasks;
>         struct list_head mg_tasks;
>         struct list_head dying_tasks;
>
> I haven't fully figured out how that protection works yet. Assuming that is
> the case, task iteration in cpuset_attach() should be fine since
> cgroup_mutex is indeed held when it is invoked. That protection, however,
> does not apply to nr_deadline_tasks. It may be too costly to acquire
> cpuset_mutex before updating nr_deadline_tasks in cgroup_exit(). So changing
> it to an atomic_t should be the easy way out of the potential racing
> problem.

My biggest perplexity is/was still about dl_rebuild_rd_accounting() and
cgroup_exit(); I wonder if the latter, operating outside the cpuset_mutex
guard, might still be racy wrt the former (even if we change to atomic_t).
However, looking at it again, dl_rebuild_rd_accounting() operates on
css(es) via css_task_iter_start(), which grabs css_set_lock. So maybe we
are already OK for this case as well?

Apologies for being pedantic, but we have already fought several times with
races around these bits and now I'm probably over-suspicious. :)

Thanks,
Juri