On Thu, Mar 08, 2012 at 04:45:13PM +0800, Li Zefan wrote:
> A similar bug exists in cpuset, and those are long-standing bugs.
>
> As reported by Frederic:
>
> > When a user freezes a cgroup, the freezer sets the subsystem state
> > to CGROUP_FREEZING and then iterates over the tasks in the cgroup links.
> >
> > But there is a possible race here, although unlikely, if a task
> > forks and the parent is preempted between write_unlock(tasklist_lock)
> > and cgroup_post_fork(). If we freeze the cgroup while the parent
> > is sleeping and the parent wakes up thereafter, its child will
> > be missing from the set of tasks to freeze because:
> >
> > - The child was not yet linked to its css_set->tasks, as is done
> >   from cgroup_post_fork(). cgroup_iter_start() has thus missed it.
> >
> > - The cgroup freezer's fork callback can handle that child but
> >   cgroup_fork_callbacks() has been called already.
>
> I try to fix it by using seqcount. We read the counter before calling
> cgroup_fork_callbacks(), and we check the counter after cgroup_post_fork().
> If the seq numbers don't match, we know the forking task's cgroup
> has been/is being frozen, so we freeze the child task.
>
> cpuset can be fixed accordingly.
>
> Reported-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> Signed-off-by: Li Zefan <lizf@xxxxxxxxxxxxxx>

I feel we are a bit stuck here. All these complications come from the
fact that we set this css_set link conditionally. I wish we could set
it unconditionally at cgroup_fork() time. Unfortunately, that implies
at least locking the css_set and doing a list_add() on every fork. And
at a time when cgroup is often criticized for the overhead it involves,
I guess this is not welcome.

This ->post_fork() based solution is not pretty, but unfortunately I
can't come up with a better idea.
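
For anyone following the thread without the patch at hand, the seqcount
idea in the quoted description can be modelled roughly as below. This is
only a single-threaded user-space sketch, not the patch itself: every
name in it (model_cgroup, model_task, model_freeze, model_fork_begin,
model_fork_end) is hypothetical, and a plain counter stands in for the
kernel's seqcount, so none of the locking or memory-ordering concerns of
the real code are represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT kernel structures. */
struct model_cgroup {
	unsigned int freeze_seq;	/* bumped every time a freeze starts */
	bool freezing;			/* stands in for CGROUP_FREEZING */
};

struct model_task {
	bool linked;	/* visible to the task iterator (css_set->tasks) */
	bool frozen;
};

/* Freezer side: bump the counter, then freeze every linked task. */
static void model_freeze(struct model_cgroup *cg,
			 struct model_task **tasks, size_t n)
{
	assert(tasks != NULL || n == 0);
	cg->freeze_seq++;
	cg->freezing = true;
	for (size_t i = 0; i < n; i++)
		if (tasks[i]->linked)
			tasks[i]->frozen = true;
}

/*
 * Fork side, first half: sample the counter, then run the stand-in for
 * the freezer's fork callback. Returns the sampled counter value.
 */
static unsigned int model_fork_begin(const struct model_cgroup *cg,
				     struct model_task *child)
{
	unsigned int seq = cg->freeze_seq;

	if (cg->freezing)
		child->frozen = true;
	return seq;
}

/*
 * Fork side, second half: link the child (stand-in for
 * cgroup_post_fork()), then compare the counter with the sample.
 * A mismatch means a freeze started inside the window, so the
 * iterator missed the child and it has to be frozen here.
 */
static void model_fork_end(const struct model_cgroup *cg,
			   struct model_task *child, unsigned int seq)
{
	child->linked = true;
	if (cg->freeze_seq != seq)
		child->frozen = true;
}
```

Calling model_fork_begin(), then model_freeze(), then model_fork_end()
reproduces the window Frederic describes: the iterator skips the
still-unlinked child, and the counter mismatch at the end is what
catches it. If no freeze happens in between, the counters match and the
child is left alone.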