Re: [PATCH] mm, oom: don't invoke oom killer if current has been reapered

On Tue, Jul 14, 2020 at 3:05 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Mon 13-07-20 21:11:50, Yafang Shao wrote:
> > On Mon, Jul 13, 2020 at 8:45 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Mon 13-07-20 20:24:07, Yafang Shao wrote:
> [...]
> > > > But we can't try locking the global oom_lock here, because a task
> > > > OOMing in memcg foo may not be able to help the tasks in memcg bar.
> > >
> > > I do not follow. oom_lock is not about fwd progress. It is a big lock to
> > > synchronize against oom_disable logic.
> > >
> > > I have this in mind
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 248e6cad0095..29d1f8c2d968 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -1563,8 +1563,10 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > >         };
> > >         bool ret;
> > >
> > > -       if (mutex_lock_killable(&oom_lock))
> > > +       if (!mutex_trylock(&oom_lock))
> > >                 return true;
> >
> >                      root_mem_cgroup
> >                     /               \
> >        memcg_a (16G)                 memcg_b (32G)
> >             |                             |
> >   process a_1 (reaches              process b_1 (reaches
> >    memcg_a limit)                    memcg_b limit)
> >   holds oom_lock                    waits on oom_lock
> >
> > So process a_1 will try to kill a process in memcg_a, while
> > process b_1 needs to kill a process in memcg_b.
> > IOW, the process killed in memcg_a can't help the processes in
> > memcg_b, so process b_1 should not trylock oom_lock here.
> >
> > Whereas if the memcg tree is
> >
> >                   target memcg_a (16G)
> >                     /             \
> >   process a_1 (reaches              process a_2 (reaches
> >    memcg_a limit)                    memcg_a limit)
> >   holds oom_lock                    waits on oom_lock
> >
> > then process a_2 can trylock oom_lock here. IOW, these processes
> > should be in the same memcg.
> >
> > That's why I said that we should introduce per-memcg oom_lock.
>
> I still fail to understand your reasoning. Sure, the oom lock is global,
> so it doesn't have a per-oom-hierarchy resolution, pretty much by definition.
> But that is not really important. The whole point of the trylock is to
> remove the ordering between the oom selection, the oom reaper and
> potential charge consumers which trigger the oom in parallel. With the
> blocking lock they would pile up in the order they have hit the OOM
> situation. With the trylock they would simply keep retrying until the
> oom is done. That would reduce the race window considerably. This is
> what the global oom is doing.
>

Thanks for the explanation.
It seems the trylock can work.
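
To double check my understanding, here is a quick userspace analogy
(plain pthreads; every name in it is invented for the example, it is
not kernel code). Chargers that lose the trylock race don't pile up
on the mutex; they back off and retry the charge, so memory freed by
a parallel kill or by the oom reaper is noticed right away:

/* toy_oom_trylock.c - userspace analogy of the oom_lock trylock.
 * Build: gcc -pthread toy_oom_trylock.c
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t oom_lock = PTHREAD_MUTEX_INITIALIZER;
static int free_pages;		/* replenished by the "OOM killer" */

/* Take one page if any are left; undo the decrement on failure. */
static bool try_charge(void)
{
	if (__atomic_fetch_sub(&free_pages, 1, __ATOMIC_RELAXED) > 0)
		return true;
	__atomic_fetch_add(&free_pages, 1, __ATOMIC_RELAXED);
	return false;
}

static void *charger(void *arg)
{
	long id = (long)arg;

	while (!try_charge()) {
		if (pthread_mutex_trylock(&oom_lock) == 0) {
			/* We won the race: "kill" something, free pages. */
			printf("charger %ld runs the OOM killer\n", id);
			__atomic_fetch_add(&free_pages, 4, __ATOMIC_RELAXED);
			pthread_mutex_unlock(&oom_lock);
		} else {
			/* Someone else is already handling the OOM: back
			 * off and retry the charge instead of sleeping on
			 * the mutex. */
			usleep(1000);
		}
	}
	printf("charger %ld charged a page\n", id);
	return NULL;
}

int main(void)
{
	pthread_t t[4];

	for (long i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, charger, (void *)i);
	for (int i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return 0;
}

With a blocking lock in this toy, the losing threads would instead
sleep in mutex_lock() and each run the "killer" in turn once woken,
which is exactly the pile-up the trylock avoids.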

> Another alternative would be to check mem_cgroup_margin after the lock
> is taken but it would be better to keep in sync with the global case as
> much as possible unless there is a good reason to differ.
>
> --
> Michal Hocko
> SUSE Labs
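
For completeness, my understanding of that alternative is something
like the sketch below (untested, just to illustrate the idea;
mem_cgroup_margin() is the existing helper that returns how many pages
the memcg can still be charged with):

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1563,10 +1563,19 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	if (mutex_lock_killable(&oom_lock))
 		return true;
+
+	/*
+	 * A task that waited on oom_lock may find that a parallel OOM
+	 * kill (or the oom reaper) has already freed enough memory; in
+	 * that case skip the redundant kill.
+	 */
+	if (mem_cgroup_margin(memcg) >= (1 << order))
+		goto unlock;
 	/*
 	 * A few threads which were not waiting at mutex_lock_killable() can
 	 * fail to bail out. Therefore, check again after holding oom_lock.
 	 */
 	ret = should_force_charge() || out_of_memory(&oc);
+unlock:
 	mutex_unlock(&oom_lock);
 	return ret;
 }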



-- 
Thanks
Yafang



