Re: cgroup-aware OOM killer, how to move forward

On Fri, 20 Jul 2018, Roman Gushchin wrote:

> > > > process chosen for oom kill.  I know that you care about the latter.  My 
> > > > *only* suggestion was for the tunable to take a string instead of a 
> > > > boolean so it is extensible for future use.  This seems like something so 
> > > > trivial.
> > > 
> > > So, I'd much prefer it as boolean.  It's a fundamentally binary
> > > property, either handle the cgroup as a unit when chosen as oom victim
> > > or not, nothing more.
> > 
> > With the single hierarchy mandate of cgroup v2, the need arises to 
> > separate processes from a single job into subcontainers for use with 
> > controllers other than mem cgroup.  In that case, we have no functionality 
> > to oom kill all processes in the subtree.
> > 
> > A boolean can kill all processes attached to the victim's mem cgroup, but 
> > cannot kill all processes in a subtree if the limit of a common ancestor 
> > is reached.
> 
> Why so?
> 
> Once again my proposal:
> as soon as the OOM killer selected a victim task,
> we'll look at the victim task's memory cgroup.
> If memory.oom.group is not set, we're done.
> Otherwise let's traverse the memory cgroup tree up to
> the OOMing cgroup (or root) as long as memory.oom.group is set.
> Kill the last cgroup entirely (including all children).
> 

I know this is your proposal.  I'm suggesting a context-based extension 
that depends on which mem cgroup is oom: the common ancestor or the leaf.

Consider /A, /A/b, and /A/c, with memory.oom.group set to 1 for all of 
them.  Per your semantics, all processes attached to /A and its subtree 
are oom killed when any of the three mem cgroups, /A, /A/b, or /A/c, is 
oom.

I'm suggesting that it may become useful to kill an entire subtree when 
the common ancestor, /A, is oom, but not when /A/b or /A/c is oom.  There 
is no way to specify this with the proposal, and trees where the sum of 
the limits of /A/b and /A/c exceeds the limit of /A do exist.  We want 
all processes in /A/b or /A/c killed if that cgroup reaches its 
individual limit.  We want all processes in /A's subtree killed if /A 
reaches its limit.
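
To make the difference concrete, here is a toy model in Python.  It is 
only a sketch, not kernel code: the Cgroup class and both kill_set_* 
helpers are hypothetical names made up for illustration.  The first 
helper models the behavior as characterized in this message (climb 
through every ancestor that has the flag set); the second models the 
context-based behavior, where the kill scope follows whichever mem 
cgroup actually hit its limit.

```python
class Cgroup:
    """Toy memory cgroup: name, parent, oom_group flag, attached PIDs."""

    def __init__(self, name, parent=None, oom_group=False, pids=()):
        self.name = name
        self.parent = parent
        self.oom_group = oom_group
        self.pids = list(pids)
        self.children = []
        if parent is not None:
            parent.children.append(self)


def subtree_pids(cg):
    """Return all PIDs attached to cg and to its descendants."""
    pids = list(cg.pids)
    for child in cg.children:
        pids.extend(subtree_pids(child))
    return pids


def kill_set_flag_only(victim_pid, victim_cg):
    """Flag-only model: climb while oom_group is set, kill that subtree.

    The result does not depend on which cgroup was actually oom.
    """
    top = None
    cg = victim_cg
    while cg is not None and cg.oom_group:
        top = cg
        cg = cg.parent
    if top is None:
        return [victim_pid]  # flag not set: only the victim task dies
    return sorted(subtree_pids(top))


def kill_set_context_based(oom_cg):
    """Context-based model: kill the subtree of the cgroup that is oom."""
    return sorted(subtree_pids(oom_cg))


# /A holds no processes itself (no internal process constraint).
A = Cgroup("/A", oom_group=True)
b = Cgroup("/A/b", parent=A, oom_group=True, pids=[101, 102])
c = Cgroup("/A/c", parent=A, oom_group=True, pids=[201])

print(kill_set_flag_only(101, b))   # victim in /A/b, any oom domain
print(kill_set_context_based(b))    # /A/b reached its own limit
print(kill_set_context_based(A))    # /A reached its limit
```

In this model the flag-only helper returns every PID under /A no matter 
which cgroup was oom, while the context-based helper returns only the 
PIDs of /A/b when /A/b is oom and the whole subtree when /A is oom.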

I am not asking for that support to be implemented immediately if you do 
not have a need for it.  But I am asking that your interface be 
extensible so that we may implement it later.  Given the no internal 
process constraint of cgroup v2, defining this as two separate tunables 
would always leave one effective and the other irrelevant, so I suggest 
a single overloaded tunable instead.



