On 06/02, Paul Menage wrote:
>
> On Wed, Jun 2, 2010 at 1:58 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >> The "it" that you're proposing to remove is in fact the code that
> >> handles those races.
> >
> > In that case I confused, and I thought we already agreed that
> > the PF_EXITING check in attach_task_by_pid() is not strictly needed
> > for correctness.
>
> Not quite - something is required for correctness, and the PF_EXITING
> check provides that correctness, with a very small window (between
> setting PF_EXITING and calling cgroup_exit) where we might arguably
> have been able to move the thread but decline to do so because it's
> simpler not to do so and no-one cares. That's the optimization that I
> meant - the data structures are slightly simpler since there's no way
> to tell when a task has passed cgroup_exit(), and instead we just see
> if they've passed PF_EXITING.
>
> > Once again, the task can call do_exit() and set PF_EXITING right
> > after the check.
>
> Yes, the important part is that they haven't set it *before* the check
> in attach_task_by_pid(). If they have set it before that, then they
> could be anywhere in the exit path after PF_EXITING, and we decline to
> move them since it's possible that they've already passed
> cgroup_exit(). If the exiting task has not yet set PF_EXITING, then it
> can't possibly get into the critical section in cgroup_exit() since
> attach_task_by_pid() holds task->alloc_lock.

It doesn't? At least not in Linus's tree.

cgroup_attach_task() does, and there the PF_EXITING check is
understandable.

Oleg.
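
For readers following the ordering argument in the quoted text, here is a
rough userspace model of it. It is only a sketch and not kernel code: the
struct and every model_* name are hypothetical, a pthread mutex stands in
for task->alloc_lock, and an atomic flag stands in for PF_EXITING. It
assumes the check is made while the lock is held, which per the reply above
is what cgroup_attach_task() does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; this is not the kernel's task_struct. */
struct model_task {
    pthread_mutex_t lock;  /* plays the role of task->alloc_lock */
    atomic_bool exiting;   /* plays the role of PF_EXITING */
    int group;             /* plays the role of the cgroup; -1 means detached */
};

void model_task_init(struct model_task *t, int group)
{
    pthread_mutex_init(&t->lock, NULL);
    atomic_init(&t->exiting, false);
    t->group = group;
}

/* Exit path, step 1: mark the task as exiting. No lock is taken here. */
void model_set_exiting(struct model_task *t)
{
    atomic_store(&t->exiting, true);
}

/* Exit path, step 2: detach from the group inside the critical section. */
void model_cgroup_exit(struct model_task *t)
{
    /* The flag is always set before this point in the exit path. */
    assert(atomic_load(&t->exiting));
    pthread_mutex_lock(&t->lock);
    t->group = -1;
    pthread_mutex_unlock(&t->lock);
}

/*
 * Attach path: returns 0 if the task was moved, -1 if the move was declined.
 * If the flag is clear while the lock is held, the exiting side cannot yet
 * be inside model_cgroup_exit()'s critical section. If the flag is set, the
 * task may already be past it, so the move is declined.
 */
int model_attach(struct model_task *t, int new_group)
{
    int ret = -1;

    pthread_mutex_lock(&t->lock);
    if (!atomic_load(&t->exiting)) {
        t->group = new_group;
        ret = 0;
    }
    pthread_mutex_unlock(&t->lock);
    return ret;
}
```

In this model the flag may still be set right after the check, as noted in
the quoted text; the exiting side then simply waits for the lock in
model_cgroup_exit() and detaches afterwards. Without the lock around the
check, that guarantee would not hold.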