On Sun, Nov 13, 2011 at 05:44:32PM +0100, Frederic Weisbecker wrote:
> On Tue, Nov 01, 2011 at 04:46:26PM -0700, Tejun Heo wrote:
> > threadgroup_lock() protected only protected against new addition to
> > the threadgroup, which was inherently somewhat incomplete and
> > problematic for its only user cgroup. On-going migration could race
> > against exec and exit leading to interesting problems - the symmetry
> > between various attach methods, task exiting during method execution,
> > ->exit() racing against attach methods, migrating task switching basic
> > properties during exec and so on.
> >
> > This patch extends threadgroup_lock() such that it protects against
> > all three threadgroup altering operations - fork, exit and exec. For
> > exit, threadgroup_change_begin/end() calls are added to exit path.
> > For exec, threadgroup_[un]lock() are updated to also grab and release
> > cred_guard_mutex.
> >
> > With this change, threadgroup_lock() guarantees that the target
> > threadgroup will remain stable - no new task will be added, no new
> > PF_EXITING will be set and exec won't happen.
> >
> > The next patch will update cgroup so that it can take full advantage
> > of this change.
>
> I don't want to nitpick really, but IMHO the races involved in exit and exec
> are too different, specific and complicated on their own to be solved in a
> single one patch. This should be split in two things.
>
> The specific problems and their fix need to be described more in detail
> in the changelog because the issues are very tricky.
>
> The exec case:
>
> IIUC, the race in exec is about the group leader that can be changed
> to become the exec'ing thread, making while_each_thread() unsafe.
> We also have other things happening there like all the other threads
> in the group that get killed, but that should be handled by the threadgroup_change_begin()
> you put in the exit path.
> The old leader is also killed but release_task() -> __unhash_process() is called
> for it manually from the exec path. However this thread too should be covered by your
> synchronisation in exit().
>
> So after your protection in the exit path, the only thing to protect against in exec
> is that group_leader that can change concurrently. But I may be missing something in the picture.
> Also note this is currently protected by the tasklist readlock. Cred guard mutex is
> certainly better, I just don't remember if you remove the tasklist lock in a
> further patch.

Ah, recalling what Ben Blum said, we also need the leader to stay stable because
it is expected to be passed to ->can_attach(), ->attach(), ->cancel_attach(), ...
Although that's going to change after your patches that pass a flex array.