On Tue 21-05-13 10:50:21, Tejun Heo wrote: > When cgroup_next_descendant_pre() initiates a walk, it checks whether > the subtree root doesn't have any children and if not returns NULL. > Later code assumes that the subtree isn't empty. This is broken > because the subtree may become empty inbetween, which can lead to the > traversal escaping the subtree by walking to the sibling of the > subtree root. > > There's no reason to have the early exit path. Remove it along with > the later assumption that the subtree isn't empty. This simplifies > the code a bit and fixes the subtle bug. > > While at it, fix the comment of cgroup_for_each_descendant_pre() which > was incorrectly referring to ->css_offline() instead of > ->css_online(). > > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx Well spotted and looks good to me Reviewed-by: Michal Hocko <mhocko@xxxxxxx> > --- > include/linux/cgroup.h | 2 +- > kernel/cgroup.c | 9 +++------ > 2 files changed, 4 insertions(+), 7 deletions(-) > > diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h > index 4f6f513..1df5f69 100644 > --- a/include/linux/cgroup.h > +++ b/include/linux/cgroup.h > @@ -709,7 +709,7 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos); > * > * If a subsystem synchronizes against the parent in its ->css_online() and > * before starting iterating, and synchronizes against @pos on each > - * iteration, any descendant cgroup which finished ->css_offline() is > + * iteration, any descendant cgroup which finished ->css_online() is > * guaranteed to be visible in the future iterations. > * > * In other words, the following guarantees that a descendant can't escape > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > index 6b2b1d9..f20f80c 100644 > --- a/kernel/cgroup.c > +++ b/kernel/cgroup.c > @@ -2990,11 +2990,8 @@ struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos, > WARN_ON_ONCE(!rcu_read_lock_held()); > > /* if first iteration, pretend we just visited @cgroup */ > - if (!pos) { > - if (list_empty(&cgroup->children)) > - return NULL; > + if (!pos) > pos = cgroup; > - } > > /* visit the first child if exists */ > next = list_first_or_null_rcu(&pos->children, struct cgroup, sibling); > @@ -3002,14 +2999,14 @@ struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos, > return next; > > /* no child, visit my or the closest ancestor's next sibling */ > - do { > + while (pos != cgroup) { > next = list_entry_rcu(pos->sibling.next, struct cgroup, > sibling); > if (&next->sibling != &pos->parent->children) > return next; > > pos = pos->parent; > - } while (pos != cgroup); > + } > > return NULL; > } > -- > 1.8.1.4 > -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html