On Mon 25-11-19 20:17:15, Yafang Shao wrote:
> On Mon, Nov 25, 2019 at 7:54 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Mon 25-11-19 19:37:59, Yafang Shao wrote:
> > > On Mon, Nov 25, 2019 at 7:08 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > >
> > > > On Mon 25-11-19 05:14:53, Yafang Shao wrote:
> > > > > We set memory.oom.group so that all processes in this memcg are killed by
> > > > > the OOM killer to free more pages. In this case, it doesn't make sense to
> > > > > protect the pages with memory.{min, low} again if they are set.
> > > >
> > > > I do not see why? What does group OOM killing have to do with
> > > > the reclaim protection? What is the actual problem you are trying to
> > > > solve?
> > > >
> > >
> > > The cgroup is treated as an indivisible workload when cgroup.oom.group
> > > is set and the OOM killer is trying to kill a process in this cgroup.
> >
> > Yes this is true.
> >
> > > We set cgroup.oom.group to guarantee the workload integrity. Now
> > > that the processes are all killed, why keep the page cache here?
> >
> > Because an administrator has configured the reclaim protection in a
> > certain way and hopefully had a good reason to do that. We are not going
> > to override that configuration just because an OOM killer was invoked
> > and killed tasks in that memcg. The workload might get restarted and it
> > would run under different constraints all of a sudden, which is not
> > expected.
> >
> > In short, the kernel should never silently change the configuration made by
> > an administrator.
>
> Understood.
>
> So what about the changes below? We don't override the admin setting,
> but we reclaim the page cache from it if this memcg is OOM killed.
> Something like,
>
> mem_cgroup_protected
> {
>     ...
> +   if (!cgroup_is_populated(memcg->css.cgroup) &&
> +       mem_cgroup_under_oom_group_kill(memcg))
> +       return MEMCG_PROT_NONE;
> +
>     usage = page_counter_read(&memcg->memory);
>     if (!usage)
>         return MEMCG_PROT_NONE;
> }

I assume that mem_cgroup_under_oom_group_kill is essentially
    memcg->under_oom && memcg->oom_group

But that doesn't really help much because all the reclaim attempts have
already been made and have failed by the time the OOM killer is invoked.
I do not remember the exact details about under_oom but I have a
recollection that it wouldn't really work for cgroup v2 because the
oom_control is not in place and so the state would be set for only a very
short time period.

Again, what is the problem that you are trying to fix?
-- 
Michal Hocko
SUSE Labs
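
For reference, the check assumed above could be sketched roughly as follows. This is only a userspace mock: struct mem_cgroup_mock is a stand-in invented for illustration, not the kernel's struct mem_cgroup, and mem_cgroup_under_oom_group_kill is the hypothetical helper named in the proposed patch, which does not exist in the kernel tree.

```c
#include <stdbool.h>

/*
 * Mock of the two fields discussed above. This is an illustrative
 * stand-in, not the kernel definition of struct mem_cgroup.
 */
struct mem_cgroup_mock {
    int under_oom;   /* non-zero while the memcg is marked as under OOM */
    bool oom_group;  /* true when memory.oom.group is set for the memcg */
};

/*
 * Hypothetical helper from the proposed patch: true only when the memcg
 * is both marked as under OOM and configured for group OOM kill.
 */
static bool mem_cgroup_under_oom_group_kill(const struct mem_cgroup_mock *memcg)
{
    return memcg->under_oom && memcg->oom_group;
}
```

The sketch only shows the boolean condition. It says nothing about how long under_oom stays set, which is the concern raised above for cgroup v2.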