Hi Tejun,

We saw a kernel NULL pointer dereference in
cgroup_pidlist_destroy_work_fn(), more precisely at
__mutex_lock_slowpath(), on 3.14. I can share the full stack trace on
request.

Looking at the code, it seems flush_workqueue() doesn't care about
newly queued works; it only waits for the ones already pending. If
that is correct, then we could have the following race condition:

cgroup_pidlist_destroy_all():
	//...
	mutex_lock(&cgrp->pidlist_mutex);
	list_for_each_entry_safe(l, tmp_l, &cgrp->pidlists, links)
		mod_delayed_work(cgroup_pidlist_destroy_wq, &l->destroy_dwork, 0);
	mutex_unlock(&cgrp->pidlist_mutex);
	// <--- another process calls cgroup_pidlist_start() here,
	//      now that the mutex has been released

	flush_workqueue(cgroup_pidlist_destroy_wq);
	// <--- the other process adds a new pidlist and queues its
	//      destroy work in parallel

	BUG_ON(!list_empty(&cgrp->pidlists));
	// <--- this check passes; the list_add() can happen after it

The newly added pidlist will therefore point to a freed cgroup, and
when the delayed work later runs to free that pidlist we crash
(presumably at the mutex_lock() on l->owner->pidlist_mutex, which
would explain the __mutex_lock_slowpath() in our trace; a trimmed
excerpt of the work function is appended after the patch).

The patch below (compile-tested ONLY) could be a possible fix: it
checks and holds a refcount on the cgroup in cgroup_pidlist_start()
and drops it in cgroup_pidlist_stop(). But I could very easily be
missing something here, since there have been many cgroup changes
after 3.14 and I don't follow cgroup development closely.

What do you think?

Thanks.
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 940aced..2206151 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4084,6 +4084,9 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 	int index = 0, pid = *pos;
 	int *iter, ret;
 
+	if (!cgroup_tryget(cgrp))
+		return NULL;
+
 	mutex_lock(&cgrp->pidlist_mutex);
 
 	/*
@@ -4132,13 +4135,15 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 
 static void cgroup_pidlist_stop(struct seq_file *s, void *v)
 {
+	struct cgroup *cgrp = seq_css(s)->cgroup;
 	struct kernfs_open_file *of = s->private;
 	struct cgroup_pidlist *l = of->priv;
 
 	if (l)
 		mod_delayed_work(cgroup_pidlist_destroy_wq, &l->destroy_dwork,
 				 CGROUP_PIDLIST_DESTROY_DELAY);
-	mutex_unlock(&seq_css(s)->cgroup->pidlist_mutex);
+	mutex_unlock(&cgrp->pidlist_mutex);
+	cgroup_put(cgrp);
 }
 
 static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
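
For reference, here is cgroup_pidlist_destroy_work_fn() as it reads in
the 3.14 source (trimmed, and with my own comments marking the suspect
dereference), in case it helps connect the race above to the
__mutex_lock_slowpath() crash:

static void cgroup_pidlist_destroy_work_fn(struct work_struct *work)
{
	struct delayed_work *dwork = to_delayed_work(work);
	struct cgroup_pidlist *l = container_of(dwork, struct cgroup_pidlist,
						destroy_dwork);
	struct cgroup_pidlist *tofree = NULL;

	/*
	 * l->owner is the cgroup this pidlist hangs off of.  If that
	 * cgroup has already been freed by the time this work runs,
	 * the lock acquisition below is a use-after-free, which would
	 * match the __mutex_lock_slowpath() crash we observed.
	 */
	mutex_lock(&l->owner->pidlist_mutex);

	/*
	 * Destroy iff we didn't get queued again (destroy_dwork can
	 * only be queued while the mutex is held, so the state is
	 * stable here).
	 */
	if (!delayed_work_pending(dwork)) {
		list_del(&l->links);
		pidlist_free(l->list);
		put_pid_ns(l->key.ns);
		tofree = l;
	}

	mutex_unlock(&l->owner->pidlist_mutex);
	kfree(tofree);
}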