>> Subject: [PATCH 1/2] cgroup: move module ref handling into rebind_subsystems() >> >> Module ref handling in cgroup is rather weird. >> parse_cgroupfs_options() grabs all the modules for the specified >> subsystems. A module ref is kept if the specified subsystem is newly >> bound to the hierarchy. If not, or the operation fails, the refs are >> dropped. This scatters module ref handling across multiple functions >> making it difficult to track. It also make the function nasty to use >> for dynamic subsystem binding which is necessary for the planned >> unified hierarchy. >> >> There's nothing which requires the subsystem modules to be pinned >> between parse_cgroupfs_options() and rebind_subsystems() in both mount >> and remount paths. parse_cgroupfs_options() can just parse and >> rebind_subsystems() can handle pinning the subsystems that it wants to >> bind, which is a natural part of its task - binding - anyway. >> >> Move module ref handling into rebind_subsystems() which makes the code >> a lot simpler - modules are gotten iff it's gonna be bound and put iff >> unbound or binding fails. >> >> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx> >> --- >> kernel/cgroup.c | 87 +++++++++++++++------------------------------------------ >> 1 file changed, 22 insertions(+), 65 deletions(-) >> >> diff --git a/kernel/cgroup.c b/kernel/cgroup.c >> index 3bc7a1a..a65aff1 100644 >> --- a/kernel/cgroup.c >> +++ b/kernel/cgroup.c >> @@ -1003,6 +1003,7 @@ static int rebind_subsystems(struct cgroupfs_root *root, >> { >> struct cgroup *cgrp = &root->top_cgroup; >> struct cgroup_subsys *ss; >> + unsigned long pinned = 0; >> int i, ret; >> >> BUG_ON(!mutex_is_locked(&cgroup_mutex)); >> @@ -1010,20 +1011,26 @@ static int rebind_subsystems(struct cgroupfs_root *root, >> >> /* Check that any added subsystems are currently free */ >> for_each_subsys(ss, i) { >> - unsigned long bit = 1UL << i; >> - >> - if (!(bit & added_mask)) >> + if (!(added_mask & (1 << i))) >> continue; >> >> + /* is the subsystem mounted elsewhere? */ >> if (ss->root != &cgroup_dummy_root) { >> - /* Subsystem isn't free */ >> - return -EBUSY; >> + ret = -EBUSY; >> + goto out_put; >> } >> + >> + /* pin the module */ >> + if (!try_module_get(ss->module)) { >> + ret = -ENOENT; >> + goto out_put; >> + } >> + pinned |= 1 << i; >> } > > This looks wrong to me. > > cgroup_mount() > { > mutex_lock(cgroup_mutex); > parse_cgroupfs_options(); > mutex_unlock(cgroup_mutex); > ... > > mutex_lock(cgroup_mutex); > ... > rebind_subsystems(); > ... > mutex_unlock(cgroup_mutex); > } > > so a modular cgroup subsystem can be unloaded inbetween, say it's net_cls, and > then it's possible that: > > # mount -t cgroup -o net_cls xxx /cgroup > > The above operation succeeds but it's not binded to cgroupfs as it just got > unloaded. > for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) { ... if (!subsys[i] && (added_mask & (1 << i)) return -EINVAL; ... } This should work. _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers