On Mon, Aug 3, 2009 at 10:54 AM, Serge E. Hallyn<serue@xxxxxxxxxx> wrote: >> + if (guarantee) { >> + /* >> + * our caller promises us that the css_set we want already >> + * exists, so we use find_existing_css_set directly. >> + */ >> + struct cgroup_subsys_state *template[CGROUP_SUBSYS_COUNT]; >> + read_lock(&css_set_lock); >> + newcg = find_existing_css_set(oldcg, cgrp, template); >> + BUG_ON(!newcg); >> + get_css_set(newcg); >> + read_unlock(&css_set_lock); >> + } else { >> + might_sleep(); > > So cgroup_task_migrate() might sleep, but > might sleep if guarantee==0 >> + retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, 1); > > Here it is called under rcu_read_lock(). With guarantee==1 > > What *exactly* is it we are protecting with cgroup_fork_mutex? > 'fork' (as the name implies) is not a good answer, since we should be > protecting data, not code. If it is solely tsk->cgroups, then perhaps > we should in fact try switching to (s?)rcu. Then cgroup_fork() could > just do rcu_read_lock, while cgroup_task_migrate() would make the change > under a spinlock (to protect against concurrent cgroup_task_migrate()s), > and using rcu_assign_pointer to let cgroup_fork() see consistent data > either before or after the update... That might mean that any checks done > before completing the migrate which involve the # of tasks might become > invalidated before the migration completes? What's being protected is the ability to move an entire thread group to the new destination cgroup, even in the presence of concurrent thread clone operations. New threads aren't visible to other threads until the point when they're attached to the tasklist, so if any concurrent do_fork() operation is somewhere between the call to cgroup_fork() and the attachment to the tasklist when the "attach proc to new cgroup" operation occurs, it may get left behind. Paul _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers