On Mon, 27 Dec 2010 05:12:28 -0500 Ben Blum <bblum@xxxxxxxxxxxxxx> wrote: > > > > About nodemask, I feel what's required is pre_pre_attach() to cache required memory > > > > before attaching as radix_tree_preload(). Then, subsys can prepare for creating > > > > working area. > > > > > > To save on global memory footprint, we could pre-allocate two nodemasks, > > > but I feel like it's not worth the increase in code complexity. This > > > would need to be done in the other cases that unsafely do NODEMASK_ALLOC > > > too... too much to keep track of for little gain. > > > > > > > But NODEMASK_ALLOC cannot be on stack when we consider 4096node systems. > > Hence global allocation, instead of on-stack. (Also, for this particular > case, the state of the nodemasks needs to persist from pre_attach to > attach_task to attach, so it can't be static inside the function > either.) > Yes, it can be global/static. > > > > Hmm...but I wonder de_thread() should take threadgroup_fork_write_unlock(). > > > > > > > > I may not understand anything important but I feel taking tasklist_lock() is overkill. > > > > > > Would rcu_read_lock() be any better? Everywhere else in the kernel that > > > iterates over all threads in a group uses either rcu_read_lock or > > > tasklist_lock. > > > > > > > Because it means that can_attach()/attach() cannot sleep, it seems to make no > > difference. > > can_attach_task(), pre_attach(), and attach_task() cannot sleep, but > can_attach() and attach() may. Careful not to confuse them. :P > > > I wonder.... if you stops all clone()/fork() of the proc in moving, you can use > > find_ge_pid(). Please see next_tgid() or next_tid() in fs/proc/base.c which > > implements scanning tasklist with sleep. Can't you use next_tid() ? > > If I'm not mistaken, that approach is vulnerable to an exit() race - > next_tid() may return NULL if pid_alive() fails, and then we stop > iterating and miss some threads. (The relevant code is kernel/exit.c, > __unhash_process, which is protected by tasklist_lock and sighand->lock, > but nothing else.) > You can use pid array, find_ge_pid() (or some), if the linked-list is not dependable. If CLONE_THREAD is blocked, it works to catch all threads without races. Thanks, -Kame _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers