[PATCH] memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event

brookxu <brookxu.cn@xxxxxxxxx> · Thu, 5 Mar 2020 13:52:03 +0800

When one eventfd monitors multiple memory thresholds of a cgroup, closing
it makes the system delete the related events. Before all of those events
have been deleted, another eventfd may start monitoring the cgroup's
memory threshold.

As a result, thresholds->primary is not empty, but thresholds->spare is
NULL, so __mem_cgroup_usage_unregister_event() crashes:

[  138.925809] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  138.926817] IP: [<ffffffff8116c9b7>] mem_cgroup_usage_unregister_event+0xd7/0x1f0
[  138.927701] PGD 73bce067 PUD 76ff3067 PMD 0
[  138.928384] Oops: 0002 [#1] SMP
[  138.935218] CPU: 1 PID: 14 Comm: kworker/1:0 Not tainted 3.10.107-1-tlinux2-0047 #1
[  138.936076] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  138.936988] Workqueue: events cgroup_event_remove
[  138.937581] task: ffff88007c07e440 ti: ffff88007c090000 task.ti: ffff88007c090000
[  138.938485] RIP: 0010:[<ffffffff8116c9b7>]  [<ffffffff8116c9b7>] mem_cgroup_usage_unregister_event+0xd7/0x1f0
[  138.940116] RSP: 0018:ffff88007c093dc0  EFLAGS: 00010202
[  138.941056] RAX: 0000000000000001 RBX: ffff880073b3e1a8 RCX: 0000000000000001
[  138.942095] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880074519900
[  138.943129] RBP: ffff88007c093df0 R08: 0000000000000001 R09: 0000000000000000
[  138.946057] R10: 000000000000b95b R11: 0000000000000001 R12: ffff880076cc0480
[  138.947805] R13: ffff880073b3e1d0 R14: 0000000000000000 R15: 0000000000000000
[  138.948903] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[  138.952264] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  138.953123] CR2: 0000000000000004 CR3: 00000000753b3000 CR4: 00000000000406e0
[  138.954110] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  138.963245] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  138.964088] Stack:
[  138.964456]  0000000000000246 ffff880076d6df68 ffff8800751b4c00 ffff880076d6df00
[  138.965650]  0000000000000040 ffff880076d6df68 ffff88007c093e18 ffffffff810b17ba
[  138.966803]  ffff88007d04cf80 ffff88007fd115c0 ffff88007fd15600 ffff88007c093e60
[  138.968179] Call Trace:
[  138.968592]  [<ffffffff810b17ba>] cgroup_event_remove+0x3a/0x80
[  138.969321]  [<ffffffff81066387>] process_one_work+0x177/0x450
[  138.970051]  [<ffffffff8106721b>] worker_thread+0x11b/0x390
[  138.970741]  [<ffffffff81067100>] ? manage_workers.isra.26+0x290/0x290
[  138.971612]  [<ffffffff8106dacf>] kthread+0xcf/0xe0
[  138.972340]  [<ffffffff8106da00>] ? insert_kthread_work+0x40/0x40
[  138.973142]  [<ffffffff81aad9f8>] ret_from_fork+0x58/0x90
[  138.973843]  [<ffffffff8106da00>] ? insert_kthread_work+0x40/0x40

The solution is to check, when deleting an event, whether the thresholds
associated with the eventfd have already been cleared. If so, we do nothing.

Signed-off-by: Chunguang Xu <brookxu@xxxxxxxxxxx>
---
 mm/memcontrol.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
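
The sequence from the description can be replayed with a small userspace
model. This is only a sketch, not kernel code: the names model_register(),
model_unregister() and replay_race(), the with_fix flag and the integer
eventfd ids are hypothetical, and sorting, locking and RCU are left out.
The unpatched path returns -1 at the point where the kernel would go on to
write through thresholds->spare, which is NULL in this scenario.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the memcg threshold structures. */
struct threshold { int eventfd; unsigned long value; };
struct threshold_ary { unsigned int size; struct threshold entries[]; };
struct thresholds { struct threshold_ary *primary, *spare; };

/* Build a primary array one entry larger; the old primary becomes spare. */
static void model_register(struct thresholds *t, int eventfd, unsigned long value)
{
    unsigned int i, old = t->primary ? t->primary->size : 0;
    struct threshold_ary *new;

    new = malloc(sizeof(*new) + (old + 1) * sizeof(new->entries[0]));
    assert(new);
    new->size = old + 1;
    for (i = 0; i < old; i++)
        new->entries[i] = t->primary->entries[i];
    new->entries[old].eventfd = eventfd;
    new->entries[old].value = value;

    free(t->spare);
    t->spare = t->primary;
    t->primary = new;
}

/* Drop every entry owned by eventfd. Returns 0 on success or no-op,
 * -1 where the unpatched code would continue with a stale request. */
static int model_unregister(struct thresholds *t, int eventfd, int with_fix)
{
    unsigned int i, j, size = 0, entries = 0;
    struct threshold_ary *new;

    if (!t->primary)
        return 0;

    for (i = 0; i < t->primary->size; i++) {
        if (t->primary->entries[i].eventfd != eventfd)
            size++;
        else
            entries++;
    }

    /* The check added by the patch: nothing left for this eventfd. */
    if (!entries)
        return with_fix ? 0 : -1;

    new = t->spare;
    if (!size) {
        free(new);
        new = NULL;
        goto swap_buffers;
    }

    new->size = size;
    for (i = 0, j = 0; i < t->primary->size; i++)
        if (t->primary->entries[i].eventfd != eventfd)
            new->entries[j++] = t->primary->entries[i];

swap_buffers:
    t->spare = t->primary;
    t->primary = new;
    if (!new) {
        free(t->spare);
        t->spare = NULL;
    }
    return 0;
}

/* eventfd 3 holds two thresholds; eventfd 4 registers between the two
 * removal events that closing eventfd 3 generates. */
int replay_race(int with_fix)
{
    struct thresholds t = { NULL, NULL };
    int rc;

    model_register(&t, 3, 100);
    model_register(&t, 3, 200);
    model_unregister(&t, 3, with_fix);      /* first event clears both entries */
    model_register(&t, 4, 300);             /* primary set, spare still NULL */
    rc = model_unregister(&t, 3, with_fix); /* second, now stale, event */

    free(t.primary);
    free(t.spare);
    return rc;
}
```

In this model replay_race(1) takes the early exit and leaves the entry of
the second eventfd untouched, while replay_race(0) reports the stale
request that the unpatched function would act on.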

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d09776c..4575a58 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4027,7 +4027,7 @@ static void __mem_cgroup_usage_unregister_event(struct mem_cgroup *memcg,
     struct mem_cgroup_thresholds *thresholds;
     struct mem_cgroup_threshold_ary *new;
     unsigned long usage;
-    int i, j, size;
+    int i, j, size, entries;
 
     mutex_lock(&memcg->thresholds_lock);
 
@@ -4047,12 +4047,18 @@ static void __mem_cgroup_usage_unregister_event(struct mem_cgroup *memcg,
     __mem_cgroup_threshold(memcg, type == _MEMSWAP);
 
     /* Calculate new number of threshold */
-    size = 0;
+    size = entries = 0;
     for (i = 0; i < thresholds->primary->size; i++) {
         if (thresholds->primary->entries[i].eventfd != eventfd)
             size++;
+        else
+            entries++;
     }
 
+    /* If items related to eventfd have been cleared, nothing to do */
+    if (!entries)
+        goto unlock;
+
     new = thresholds->spare;
 
     /* Set thresholds array to NULL if we don't have thresholds */
-- 
1.8.3.1