On Mon, Nov 25, 2019 at 7:54 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Mon 25-11-19 19:37:59, Yafang Shao wrote:
> > On Mon, Nov 25, 2019 at 7:08 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Mon 25-11-19 05:14:53, Yafang Shao wrote:
> > > > We set memory.oom.group to have all processes in this memcg killed by
> > > > the OOM killer to free more pages. In this case, it doesn't make sense to
> > > > protect the pages with memory.{min, low} again if they are set.
> > >
> > > I do not see why? What does group OOM killing have to do with
> > > the reclaim protection? What is the actual problem you are trying to
> > > solve?
> > >
> >
> > The cgroup is treated as an indivisible workload when memory.oom.group
> > is set and the OOM killer is trying to kill a process in this cgroup.
>
> Yes this is true.
>
> > We set memory.oom.group to guarantee the workload integrity. Now
> > that the processes are all killed, why keep the page cache here?
>
> Because an administrator has configured the reclaim protection in a
> certain way and hopefully had a good reason to do that. We are not going
> to override that configuration just because an OOM killer was invoked
> and killed tasks in that memcg. The workload might get restarted and it
> would run under different constraints all of a sudden, which is not
> expected.
>
> In short, the kernel should never silently change the configuration made by
> an administrator.

Understood.

So what about the changes below?
We don't override the admin setting, but we reclaim the page cache from
the memcg if it has been OOM killed.
Something like,

mem_cgroup_protected
{
        ...
+       if (!cgroup_is_populated(memcg->css.cgroup) &&
+           mem_cgroup_under_oom_group_kill(memcg))
+               return MEMCG_PROT_NONE;
+
        usage = page_counter_read(&memcg->memory);
        if (!usage)
                return MEMCG_PROT_NONE;
}
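
To make the intended behaviour easier to reason about, here is a rough user-space model of the check proposed above. It is only a sketch and not kernel code. The struct, its fields, and the function names are hypothetical stand-ins: populated stands in for cgroup_is_populated(), and under_oom_group_kill stands in for the mem_cgroup_under_oom_group_kill() helper, which would still have to be written. The min/low comparison is a simplification that ignores the hierarchy and compares usage directly with the configured values.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; all names here are hypothetical. */
enum prot { PROT_NONE, PROT_LOW, PROT_MIN };

struct memcg_model {
	bool populated;            /* stands in for cgroup_is_populated() */
	bool under_oom_group_kill; /* stands in for the proposed helper */
	unsigned long usage;       /* charged pages */
	unsigned long min;         /* memory.min, in pages */
	unsigned long low;         /* memory.low, in pages */
};

enum prot model_protected(const struct memcg_model *m)
{
	/*
	 * Proposed early exit: every task was group-OOM-killed and the
	 * cgroup is empty, so the leftover pages get no protection.
	 * The configured min/low values themselves are not modified.
	 */
	if (!m->populated && m->under_oom_group_kill)
		return PROT_NONE;

	if (!m->usage)
		return PROT_NONE;

	/* Simplified: no hierarchy, no effective min/low. */
	if (m->usage <= m->min)
		return PROT_MIN;
	if (m->usage <= m->low)
		return PROT_LOW;
	return PROT_NONE;
}

/* A few expectations for the model above. */
void model_selftest(void)
{
	struct memcg_model m = {
		.populated = true, .under_oom_group_kill = false,
		.usage = 100, .min = 200, .low = 300,
	};

	/* Live workload below memory.min keeps its protection. */
	assert(model_protected(&m) == PROT_MIN);

	/* After a group OOM kill empties the cgroup, protection is dropped. */
	m.populated = false;
	m.under_oom_group_kill = true;
	assert(model_protected(&m) == PROT_NONE);

	/* The admin setting is untouched, so a restart is protected again. */
	m.populated = true;
	m.under_oom_group_kill = false;
	assert(m.min == 200 && model_protected(&m) == PROT_MIN);
}
```

The last case in model_selftest() is the point of the proposal: the protection is skipped only while the memcg is empty after a group OOM kill, and the values written by the administrator stay as they were.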