Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens

Yafang Shao <laoar.shao@xxxxxxxxx> · Mon, 25 Nov 2019 20:17:15 +0800

On Mon, Nov 25, 2019 at 7:54 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Mon 25-11-19 19:37:59, Yafang Shao wrote:
> > On Mon, Nov 25, 2019 at 7:08 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Mon 25-11-19 05:14:53, Yafang Shao wrote:
> > > > We set memory.oom.group so that all processes in this memcg are killed by
> > > > the OOM killer to free more pages. In this case, it doesn't make sense to
> > > > protect the pages with memory.{min, low} again if they are set.
> > >
> > > I do not see why. What does group OOM killing have to do with
> > > the reclaim protection? What is the actual problem you are trying to
> > > solve?
> > >
> >
> > The cgroup is treated as an indivisible workload when memory.oom.group
> > is set and the OOM killer is trying to kill a process in this cgroup.
>
> Yes this is true.
>
> > We set memory.oom.group to guarantee the workload integrity. Now
> > that the processes are all killed, why keep the page cache here?
>
> Because an administrator has configured the reclaim protection in a
> certain way and hopefully had a good reason to do that. We are not going
> to override that configuration just because the OOM killer was invoked
> and killed tasks in that memcg. The workload might get restarted and it
> would run under different constraints all of a sudden, which is not
> expected.
>
> In short, the kernel should never silently change the configuration made by
> an administrator.

Understood.

So what about the change below? We don't override the admin setting,
but we reclaim the page cache from the memcg if it has been OOM killed.
Something like:

mem_cgroup_protected
{
...
+       if (!cgroup_is_populated(memcg->css.cgroup) &&
+           mem_cgroup_under_oom_group_kill(memcg))
+               return MEMCG_PROT_NONE;
+
        usage = page_counter_read(&memcg->memory);
        if (!usage)
                return MEMCG_PROT_NONE;
}
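
The intended behaviour can be illustrated with a minimal user-space
model of the check above. This is only a sketch, not kernel code:
struct memcg_model, its fields, and model_protected() are hypothetical
stand-ins. mem_cgroup_under_oom_group_kill() is the helper proposed
above and is represented here by a plain flag. The real
mem_cgroup_protected() compares usage against the effective min/low
values. The model reduces that to two simple comparisons.

```c
#include <stdbool.h>

/* Mirrors the kernel's enum names; the values are only for this model. */
enum memcg_prot {
	MEMCG_PROT_NONE,
	MEMCG_PROT_LOW,
	MEMCG_PROT_MIN,
};

/* Hypothetical stand-in for struct mem_cgroup. */
struct memcg_model {
	bool populated;         /* stands in for cgroup_is_populated() */
	bool oom_group_killed;  /* stands in for the proposed helper */
	unsigned long usage;    /* pages currently charged */
	unsigned long min;      /* memory.min as set by the admin */
	unsigned long low;      /* memory.low as set by the admin */
};

static enum memcg_prot model_protected(const struct memcg_model *m)
{
	/* Proposed check: an empty, group-OOM-killed memcg gets no protection. */
	if (!m->populated && m->oom_group_killed)
		return MEMCG_PROT_NONE;

	/* Existing check: nothing charged, nothing to protect. */
	if (!m->usage)
		return MEMCG_PROT_NONE;

	/* Simplified protection decision (an assumption of this model). */
	if (m->usage <= m->min)
		return MEMCG_PROT_MIN;
	if (m->usage <= m->low)
		return MEMCG_PROT_LOW;

	return MEMCG_PROT_NONE;
}
```

The min and low fields are never written, so the admin's configuration
stays as it was set. In this model, once the memcg is populated again
(the workload restarted), the first check no longer applies and the
configured protection takes effect again.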