The commit 74e4b956eb1c incorrectly wrapped kernfs_walk_and_get
(might_sleep) under css_set_lock (spinlock). css_set_lock is needed by
__cset_cgroup_from_root to ensure stable cset->cgrp_links.

The returned cgroup object is pinned by the css_set (*). Because current
cannot switch namespace asynchronously, the css_set is also pinned by
ns_proxy->cgroup_ns (regardless of current's cgroup migration). Kernfs
code that traverses paths with a relative root_cgroup does not need
css_set_lock.

(*) Except for root cgroups. The default hierarchy root (the only one
under which cgroup id and path resolution happens) is eternal so it's
moot. cgroup_show_path (VFS callback) is expected to be synchronized (**)
wrt kill_sb (VFS callback) (mnt_namespace.list with namespace_sem).
(**) If not, it's still an independent issue from this and the fixed
one.

Fixes: 74e4b956eb1c: ("cgroup: Honor caller's cgroup NS when resolving path")
Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Signed-off-by: Michal Koutný <mkoutny@xxxxxxxx>
---
 kernel/cgroup/cgroup.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

I considered adding get_cgroup() into current_cgns_cgroup_from_root to
avoid reliance on the transitive pinning via css_set. After reasoning
about no asynchronous NS switch and v1 hierarchies' kill_sb it didn't seem
to bring that much benefit (it didn't compose well with BUG_ON(!cgrp)
either).

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index e0b72eb5d283..8c9497f01332 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1391,11 +1391,16 @@ static void cgroup_destroy_root(struct cgroup_root *root)
 	cgroup_free_root(root);
 }
 
+/*
+ * Returned cgroup is without refcount but it's valid as long as cset pins it.
+ */
 static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
 					    struct cgroup_root *root)
 {
 	struct cgroup *res_cgroup = NULL;
 
+	lockdep_assert_held(&css_set_lock);
+
 	if (cset == &init_css_set) {
 		res_cgroup = &root->cgrp;
 	} else if (root == &cgrp_dfl_root) {
@@ -1426,8 +1431,6 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
 	struct cgroup *res = NULL;
 	struct css_set *cset;
 
-	lockdep_assert_held(&css_set_lock);
-
 	rcu_read_lock();
 
 	cset = current->nsproxy->cgroup_ns->root_cset;
@@ -1446,7 +1449,6 @@ static struct cgroup *cset_cgroup_from_root(struct css_set *cset,
 	struct cgroup *res = NULL;
 
 	lockdep_assert_held(&cgroup_mutex);
-	lockdep_assert_held(&css_set_lock);
 
 	res = __cset_cgroup_from_root(cset, root);
 
@@ -1861,8 +1863,8 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 
 	spin_lock_irq(&css_set_lock);
 	ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
-	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
 	spin_unlock_irq(&css_set_lock);
+	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
 
 	if (len >= PATH_MAX)
 		len = -ERANGE;
@@ -6649,8 +6651,8 @@ struct cgroup *cgroup_get_from_path(const char *path)
 
 	spin_lock_irq(&css_set_lock);
 	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
-	kn = kernfs_walk_and_get(root_cgrp->kn, path);
 	spin_unlock_irq(&css_set_lock);
+	kn = kernfs_walk_and_get(root_cgrp->kn, path);
 	if (!kn)
 		goto out;
 

base-commit: a8c52eba880a6e8c07fc2130604f8e386b90b763
-- 
2.37.0
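
For illustration only, the locking pattern the patch moves to can be sketched in userspace. This is an analogy under stated assumptions, not kernel code: a pthread mutex stands in for css_set_lock, and every name below (struct owner, struct node, lookup_locked, slow_name_len, resolve_len) is hypothetical. The lock covers only the lookup in the link table; the potentially blocking work runs after unlock and relies on the owner keeping the returned object alive.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cgroup-like object. */
struct node {
	const char *name;
};

/* Hypothetical stand-in for a css_set-like owner that pins its nodes. */
struct owner {
	pthread_mutex_t lock;	/* protects links[] and nr_links only */
	struct node *links[4];
	size_t nr_links;
};

/*
 * Caller must hold o->lock. The returned node carries no reference of
 * its own; it stays valid only as long as the owner pins it.
 */
static struct node *lookup_locked(struct owner *o, const char *name)
{
	size_t i;

	for (i = 0; i < o->nr_links; i++)
		if (strcmp(o->links[i]->name, name) == 0)
			return o->links[i];
	return NULL;
}

/* Stand-in for work that may block, so it must not run under the lock. */
static size_t slow_name_len(const struct node *n)
{
	return strlen(n->name);
}

/* Look up under the lock, then do the blocking part after unlock. */
static long resolve_len(struct owner *o, const char *name)
{
	struct node *n;

	pthread_mutex_lock(&o->lock);
	n = lookup_locked(o, name);
	pthread_mutex_unlock(&o->lock);

	if (!n)
		return -1;
	return (long)slow_name_len(n);
}
```

A caller would initialize the owner with PTHREAD_MUTEX_INITIALIZER, fill links[], and call resolve_len(&o, "name"). The sketch does not model the namespace pinning argument itself; it only shows the narrowed critical section.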