On 2013/6/29 12:12, Tejun Heo wrote: > Hello, > > These two patches are on top of > > cgroup/for-3.11 0ce6cba35 ("cgroup: CGRP_ROOT_SUBSYS_BOUND should be ignored when comparing mount options") > + "cgroup: fix and clean up cgroup file creations and removals" patchset > http://thread.gmane.org/gmane.linux.kernel.cgroups/8245 > > and available in the following git branch > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-module-ref > > Both patchsets I'm posting today are too late for for-3.11 unless > there's gonna be -rc8. I'll collect the reviews and apply these after > -rc1 drops. > > Thanks. > > ---------------------------------- 8< ---------------------------------- >>From e8645bf68bcada3e07538fd042e9f0ce8a1e72cd Mon Sep 17 00:00:00 2001 > From: Tejun Heo <tj@xxxxxxxxxx> > Date: Fri, 28 Jun 2013 21:08:27 -0700 > Subject: [PATCH 1/2] cgroup: move module ref handling into rebind_subsystems() > > Module ref handling in cgroup is rather weird. > parse_cgroupfs_options() grabs all the modules for the specified > subsystems. A module ref is kept if the specified subsystem is newly > bound to the hierarchy. If not, or the operation fails, the refs are > dropped. This scatters module ref handling across multiple functions > making it difficult to track. It also make the function nasty to use > for dynamic subsystem binding which is necessary for the planned > unified hierarchy. > > There's nothing which requires the subsystem modules to be pinned > between parse_cgroupfs_options() and rebind_subsystems() in both mount > and remount paths. parse_cgroupfs_options() can just parse and > rebind_subsystems() can handle pinning the subsystems that it wants to > bind, which is a natural part of its task - binding - anyway. > > Move module ref handling into rebind_subsystems() which makes the code > a lot simpler - modules are gotten iff it's gonna be bound and put iff > unbound or binding fails. > > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx> > --- > kernel/cgroup.c | 87 +++++++++++++++------------------------------------------ > 1 file changed, 22 insertions(+), 65 deletions(-) > > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > index 3bc7a1a..a65aff1 100644 > --- a/kernel/cgroup.c > +++ b/kernel/cgroup.c > @@ -1003,6 +1003,7 @@ static int rebind_subsystems(struct cgroupfs_root *root, > { > struct cgroup *cgrp = &root->top_cgroup; > struct cgroup_subsys *ss; > + unsigned long pinned = 0; > int i, ret; > > BUG_ON(!mutex_is_locked(&cgroup_mutex)); > @@ -1010,20 +1011,26 @@ static int rebind_subsystems(struct cgroupfs_root *root, > > /* Check that any added subsystems are currently free */ > for_each_subsys(ss, i) { > - unsigned long bit = 1UL << i; > - > - if (!(bit & added_mask)) > + if (!(added_mask & (1 << i))) > continue; > > + /* is the subsystem mounted elsewhere? */ > if (ss->root != &cgroup_dummy_root) { > - /* Subsystem isn't free */ > - return -EBUSY; > + ret = -EBUSY; > + goto out_put; > } > + > + /* pin the module */ > + if (!try_module_get(ss->module)) { > + ret = -ENOENT; > + goto out_put; > + } > + pinned |= 1 << i; > } This looks wrong to me. cgroup_mount() { mutex_lock(cgroup_mutex); parse_cgroupfs_options(); mutex_unlock(cgroup_mutex); ... mutex_lock(cgroup_mutex); ... rebind_subsystems(); ... mutex_unlock(cgroup_mutex); } so a modular cgroup subsystem can be unloaded inbetween, say it's net_cls, and then it's possible that: # mount -t cgroup -o net_cls xxx /cgroup The above operation succeeds but it's not binded to cgroupfs as it just got unloaded. _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers