Hi Tetsuo,

On Sun, Jan 27, 2019 at 11:57:38PM +0900, Tetsuo Handa wrote:
> From 9c9e935fc038342c48461aabca666f1b544e32b1 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Sun, 27 Jan 2019 23:51:37 +0900
> Subject: [PATCH v3] oom, oom_reaper: do not enqueue same task twice
>
> Arkadiusz reported that enabling memcg's group oom killing causes
> strange memcg statistics where there is no task in a memcg despite
> the number of tasks in that memcg is not 0. It turned out that there
> is a bug in wake_oom_reaper() which allows enqueuing same task twice
> which makes impossible to decrease the number of tasks in that memcg
> due to a refcount leak.
>
> This bug existed since the OOM reaper became invokable from
> task_will_free_mem(current) path in out_of_memory() in Linux 4.7,
> but memcg's group oom killing made it easier to trigger this bug by
> calling wake_oom_reaper() on the same task from one out_of_memory()
> request.

This changelog seems a little terse compared to how tricky this is.

Can you please include an explanation here of *how* this bug is possible?
I.e. the race condition that causes the function to be entered twice and
the existing re-entrance check in there to fail.

> Fix this bug using an approach used by commit 855b018325737f76
> ("oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task").
> As a side effect of this patch, this patch also avoids enqueuing
> multiple threads sharing memory via task_will_free_mem(current) path.
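
For illustration only, the general "enqueue at most once" idea discussed in
the changelog could be sketched in userspace C roughly as below. This is not
the kernel code from the patch: struct shared_mm, struct task, reaper_list
and enqueue_once() are all hypothetical names, and the sketch assumes that
the "already queued" state is tracked on the shared object with an atomic
test-and-set, so that a second enqueue attempt (for the same task, or for
another thread sharing the same memory) takes no extra reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for memory shared by several threads. */
struct shared_mm {
	atomic_flag queued;	/* set once this mm was handed to the reaper */
	int refcount;		/* one reference per successful enqueue */
};

/* Hypothetical stand-in for a task; several tasks may share one mm. */
struct task {
	struct shared_mm *mm;
	struct task *next;	/* link in the reaper list */
};

/* Demo list only; real code would protect it with a lock. */
static struct task *reaper_list;

/*
 * Queue @t for reaping unless its mm has already been queued.
 * Returns true if the task was queued, false if it was skipped.
 */
static bool enqueue_once(struct task *t)
{
	assert(t != NULL && t->mm != NULL);

	/* Atomically set the flag; a previous "set" means someone beat us. */
	if (atomic_flag_test_and_set(&t->mm->queued))
		return false;

	t->mm->refcount++;	/* reference taken exactly once per mm */
	t->next = reaper_list;
	reaper_list = t;
	return true;
}
```

With a struct shared_mm initialised as { ATOMIC_FLAG_INIT, 0 }, the first
enqueue_once() call for a task returns true and any later call for the same
task, or for another task pointing at the same mm, returns false, so the
reference count stays at 1 and can be dropped exactly once.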