Kernel crash in cgroup_pidlist_destroy_work_fn()

Hi, Tejun


We saw some kernel NULL pointer dereferences in
cgroup_pidlist_destroy_work_fn(), more precisely in
__mutex_lock_slowpath(), on 3.14. I can send you the full stack trace
on request.

Looking at the code, it seems flush_workqueue() does not wait for work
items queued after it is called; it only waits for the ones already
pending. If this is correct, we could have the following race
condition:

cgroup_pidlist_destroy_all():
        //...
        mutex_lock(&cgrp->pidlist_mutex);
        list_for_each_entry_safe(l, tmp_l, &cgrp->pidlists, links)
                mod_delayed_work(cgroup_pidlist_destroy_wq,
                                 &l->destroy_dwork, 0);
        mutex_unlock(&cgrp->pidlist_mutex);

        // <--- another process can call cgroup_pidlist_start() here,
        //      since the mutex is released

        // <--- another process adds a new pidlist and queues work
        //      in parallel
        flush_workqueue(cgroup_pidlist_destroy_wq);

        // <--- this check passes; list_add() could happen after it
        BUG_ON(!list_empty(&cgrp->pidlists));


Therefore, the newly added pidlist will point to a freed cgroup, and
when that pidlist is later destroyed by the delayed work we will crash
on the freed cgroup's pidlist_mutex.
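
To make the assumption about flush_workqueue() concrete, here is a
small user-space toy model. It is NOT the kernel workqueue code; the
names toy_wq, toy_queue(), toy_flush() and toy_pending() are made up for
illustration. The flush only runs items that were pending when it
started, so an item queued while the flush is running is left behind:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 8

typedef void (*work_fn)(void *);

/* Hypothetical single-threaded stand-in for a workqueue. */
struct toy_wq {
	work_fn fn[MAX_WORK];
	void *arg[MAX_WORK];
	size_t head, tail;	/* pending items live in [head, tail) */
};

static void toy_queue(struct toy_wq *wq, work_fn fn, void *arg)
{
	assert(wq->tail < MAX_WORK);
	wq->fn[wq->tail] = fn;
	wq->arg[wq->tail] = arg;
	wq->tail++;
}

/* Run only the items that were already pending when the flush began. */
static void toy_flush(struct toy_wq *wq)
{
	size_t barrier = wq->tail;

	while (wq->head < barrier) {
		size_t i = wq->head++;

		wq->fn[i](wq->arg[i]);
	}
}

static size_t toy_pending(const struct toy_wq *wq)
{
	return wq->tail - wq->head;
}

/* Models "someone queues new work while the flush is in progress". */
struct late {
	struct toy_wq *wq;
	int *ran;
};

static void count_run(void *arg)
{
	++*(int *)arg;
}

static void run_and_requeue(void *arg)
{
	struct late *l = arg;

	++*l->ran;
	toy_queue(l->wq, count_run, l->ran);
}
```

Queueing run_and_requeue() once and calling toy_flush() once leaves one
item pending afterwards, which is the window described above.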

The attached patch (compile-tested ONLY) is a possible fix: it tries
to take a reference on the cgroup in cgroup_pidlist_start(), bails out
if that fails, and drops the reference in cgroup_pidlist_stop(). But I
could very easily be missing something here, since cgroup has changed
a lot since 3.14 and I don't follow cgroup development.
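
For reference, the pattern the patch relies on looks roughly like this
in user space. This is only a sketch with C11 atomics, not the kernel's
cgroup_tryget()/cgroup_put(); struct obj, obj_tryget(), obj_put() and
the reader_start()/reader_stop() pair are hypothetical names:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a cgroup. */
struct obj {
	atomic_int refcnt;	/* 0 means dead: no new references allowed */
};

static void obj_init(struct obj *o)
{
	atomic_init(&o->refcnt, 1);	/* the creator holds one reference */
}

/* Take a reference only if the object is still alive. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old > 0) {
		/* On failure, 'old' is reloaded and re-checked. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);
	return old == 1;
}

/* Reader path shaped like the patched start()/stop() pair. */
static bool reader_start(struct obj *o)
{
	return obj_tryget(o);
}

static void reader_stop(struct obj *o, bool started)
{
	/* Assumption: only undo what start() actually did. */
	if (started)
		obj_put(o);
}
```

Note the 'started' guard in reader_stop(). I have not checked whether
seq_file calls ->stop() after ->start() returned NULL; if it does, the
unlock and put in cgroup_pidlist_stop() would need a similar guard.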

What do you think?

Thanks.
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 940aced..2206151 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4084,6 +4084,9 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 	int index = 0, pid = *pos;
 	int *iter, ret;
 
+	if (!cgroup_tryget(cgrp))
+		return NULL;
+
 	mutex_lock(&cgrp->pidlist_mutex);
 
 	/*
@@ -4132,13 +4135,15 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 
 static void cgroup_pidlist_stop(struct seq_file *s, void *v)
 {
+	struct cgroup *cgrp = seq_css(s)->cgroup;
 	struct kernfs_open_file *of = s->private;
 	struct cgroup_pidlist *l = of->priv;
 
 	if (l)
 		mod_delayed_work(cgroup_pidlist_destroy_wq, &l->destroy_dwork,
 				 CGROUP_PIDLIST_DESTROY_DELAY);
-	mutex_unlock(&seq_css(s)->cgroup->pidlist_mutex);
+	mutex_unlock(&cgrp->pidlist_mutex);
+	cgroup_put(cgrp);
 }
 
 static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
