On Tue 17-07-18 13:06:42, Roman Gushchin wrote:
> On Tue, Jul 17, 2018 at 09:49:46PM +0200, Michal Hocko wrote:
> > On Tue 17-07-18 10:38:45, Roman Gushchin wrote:
> > [...]
> > > Let me show my proposal on examples. Let's say we have the following hierarchy,
> > > and the biggest process (or the process with highest oom_score_adj) is in D.
> > >
> > >   /
> > >   |
> > >   A
> > >   |
> > >   B
> > >  / \
> > > C   D
> > >
> > > Let's look at different examples and intended behavior:
> > > 1) system-wide OOM
> > >   - default settings: the biggest process is killed
> > >   - D/memory.group_oom=1: all processes in D are killed
> > >   - A/memory.group_oom=1: all processes in A are killed
> > > 2) memcg oom in B
> > >   - default settings: the biggest process is killed
> > >   - A/memory.group_oom=1: the biggest process is killed
> >
> > Huh? Why would you even consider A here when the oom is below it?
> > /me confused
>
> I do not.
> This is exactly a counter-example: A's memory.group_oom
> is not considered at all in this case,
> because A is above ooming cgroup.

OK, it confused me.

> >
> > >   - B/memory.group_oom=1: all processes in B are killed
> >
> > >   - B/memory.group_oom=0 &&
> > >   - D/memory.group_oom=1: all processes in D are killed
> >
> > What about?
> > - B/memory.group_oom=1 && D/memory.group_oom=0
>
> All tasks in B are killed.

So essentially: find a task, then traverse the memcg hierarchy from the
victim's memcg up to the oom root as long as memcg.group_oom = 1? If the
resulting memcg.group_oom == 1 then kill the whole subtree. Right?

> Group_oom set to 1 means that the workload can't tolerate
> killing of a random process, so in this case it's better
> to guarantee consistency for B.

OK, but then if D itself is OOM, we do not care about consistency all of
a sudden? I have a hard time thinking of a sensible use case.
-- 
Michal Hocko
SUSE Labs
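
For illustration only, here is a minimal Python sketch of one reading of
the rule discussed above. It is not kernel code. It assumes the kill
scope is the highest memcg with group_oom=1 found while walking from the
victim's memcg up to (and including) the OOM root, because that reading
reproduces every example quoted in this thread, including
"B/memory.group_oom=1 && D/memory.group_oom=0". The names Memcg,
build_hierarchy and kill_scope are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memcg:
    """Hypothetical stand-in for a memory cgroup (not a kernel structure)."""
    name: str
    parent: Optional["Memcg"] = None
    group_oom: int = 0


def build_hierarchy():
    """Build the / -> A -> B -> {C, D} hierarchy from the thread."""
    root = Memcg("/")
    a = Memcg("A", root)
    b = Memcg("B", a)
    c = Memcg("C", b)
    d = Memcg("D", b)
    return root, a, b, c, d


def kill_scope(victim_memcg, oom_root):
    """Return the memcg whose tasks are all killed, or None.

    Assumption: walk from the victim's memcg up to oom_root (inclusive)
    and remember the highest memcg that has group_oom == 1. Anything
    above oom_root is never looked at. None means that only the selected
    victim process is killed.
    """
    scope = None
    node = victim_memcg
    while node is not None:
        if node.group_oom == 1:
            scope = node
        if node is oom_root:
            break
        node = node.parent
    return scope


if __name__ == "__main__":
    # memcg OOM in B with B/memory.group_oom=1 && D/memory.group_oom=0
    root, a, b, c, d = build_hierarchy()
    b.group_oom = 1
    scope = kill_scope(d, oom_root=b)
    print(scope.name if scope else "victim process only")
```

With this reading, setting only A/memory.group_oom=1 has no effect on an
OOM in B, because the walk stops at B and never reaches A.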