On Wed 19-04-23 03:07:39, Haifeng Xu wrote:
> Before commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to
> the charge path"), all memcg oom killers were delayed to the page
> fault path, and the explicit wakeup is used in this case:
>
> thread A:
>         ...
>         if (locked) { // complete oom-kill, hold the lock
>                 mem_cgroup_oom_unlock(memcg);
>                 ...
>         }
>         ...
>
> thread B:
>         ...
>         if (locked && !memcg->oom_kill_disable) {
>                 ...
>         } else {
>                 schedule(); // can't acquire the lock
>                 ...
>         }
>         ...
>
> The reason is that thread A kicks off the OOM killer, which leads to
> wakeups from the uncharges of the exiting task. But thread B is not
> guaranteed to see them if it enters the OOM path after the OOM kills
> but before thread A releases the lock.
>
> Now only the oom_kill_disable case is handled from the #PF path. In
> that case it is userspace that triggers the wakeup, not the #PF path
> itself. All potential paths to free some charges are responsible for
> calling memcg_oom_recover(), so the explicit wakeup is not needed in
> the mem_cgroup_oom_synchronize() path, which doesn't release any
> memory itself.
>
> Signed-off-by: Haifeng Xu <haifeng.xu@xxxxxxxxxx>
> Suggested-by: Michal Hocko <mhocko@xxxxxxxx>

I hope I haven't missed anything but this looks good to me. One way to
test this would be a parallel OOM-triggering workload which uses page
faults, plus an automatic user space driven oom killer (detect
under_oom and send SIGKILL to a random task after a random timeout).
The objective is that no task gets stuck in
mem_cgroup_oom_synchronize(). I am pretty sure this could be easily
turned into a selftest.
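The user space oom killer loop described above could be sketched roughly as follows. This is a minimal sketch, not a finished selftest: it assumes a cgroup v1 memory controller (where `under_oom` is reported by the `memory.oom_control` file) and a hypothetical test cgroup path; the file locations and the cgroup name are assumptions, not taken from the patch.

```shell
#!/bin/sh
# Sketch of a user space oom killer: poll under_oom, SIGKILL a random
# task in the cgroup after a random delay. Paths are illustrative.

# Return success if the memory.oom_control file ($1) reports under_oom 1.
under_oom() {
	grep -q '^under_oom 1' "$1"
}

# Pick a random pid from a cgroup.procs file ($1) and SIGKILL it.
kill_random_task() {
	victim=$(shuf -n 1 "$1")
	[ -n "$victim" ] && kill -KILL "$victim"
}

# Main loop (needs root and a v1 memory controller; "oomtest" is a
# made-up cgroup name for illustration):
#
# CG=/sys/fs/cgroup/memory/oomtest
# while true; do
# 	under_oom "$CG/memory.oom_control" && \
# 		kill_random_task "$CG/cgroup.procs"
# 	sleep "0.$((RANDOM % 9 + 1))"
# done
```

The pass/fail criterion would be, as noted above, that no task in the workload stays stuck in mem_cgroup_oom_synchronize() after the kills.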
Acked-by: Michal Hocko <mhocko@xxxxxxxx>

> ---
> v2: split original into two and improve patch description
> ---
>  mm/memcontrol.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index fbf4d2bb1003..710ce3e7824f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2003,15 +2003,8 @@ bool mem_cgroup_oom_synchronize(bool handle)
>  	mem_cgroup_unmark_under_oom(memcg);
>  	finish_wait(&memcg_oom_waitq, &owait.wait);
>
> -	if (locked) {
> +	if (locked)
>  		mem_cgroup_oom_unlock(memcg);
> -		/*
> -		 * There is no guarantee that an OOM-lock contender
> -		 * sees the wakeups triggered by the OOM kill
> -		 * uncharges. Wake any sleepers explicitly.
> -		 */
> -		memcg_oom_recover(memcg);
> -	}
> cleanup:
>  	current->memcg_in_oom = NULL;
>  	css_put(&memcg->css);
> --
> 2.25.1

-- 
Michal Hocko
SUSE Labs