Re: cgroup-aware OOM killer, how to move forward

On Fri, Jul 13, 2018 at 02:34:49PM -0700, David Rientjes wrote:
> On Wed, 11 Jul 2018, Roman Gushchin wrote:
> 
> > I was thinking about how to move forward with the cgroup-aware OOM killer.
> > It looks to me that we all agree on the "cleanup" part of the patchset:
> > it's a nice feature to be able to kill all tasks in a cgroup
> > to guarantee a consistent state of the workload.
> > All our disagreements are about the victim selection algorithm.
> > 
> > So I wonder if the right thing to do is to split the problem.
> > We can agree on the "cleanup" part, which is useful by itself,
> > merge it upstream, and then return to the victim selection
> > algorithm.
> > 
> > So, here is my proposal:
> > let's introduce the memory.group_oom knob with the following semantics:
> > if the knob is set, the OOM killer kills either none or all of the
> > tasks in the cgroup*.
> > It works perfectly well with the current OOM killer (as a "cleanup" option),
> > and allows _any_ further approach to OOM victim selection.
> > It also doesn't require any mount/boot/tree-wide options.
> > 
> > How does it sound?
> > 
> 
> No objection, of course, this was always the mechanism vs policy 
> separation that I was referring to.  Having the ability to kill all 
> processes attached to the cgroup when one of its processes is selected is 
> useful, and we have our own patches that do just that, with the exception 
> that it's triggerable by the user.

Perfect! I'll prepare the patchset.
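
For concreteness, enabling the knob from userspace could look something
like the sketch below. The cgroup path is made up, and since nothing is
merged yet, the knob name is just the one proposed above:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Path is illustrative; assumes cgroup v2 mounted at /sys/fs/cgroup. */
	int fd = open("/sys/fs/cgroup/workload/memory.group_oom", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* Writing "1" opts the cgroup into all-or-nothing OOM handling. */
	if (write(fd, "1", 1) != 1) {
		perror("write");
		close(fd);
		return 1;
	}
	close(fd);
	return 0;
}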

> 
> One of the things that I really like about cgroup v2, though, is what 
> appears to be an implicit, but rather apparent, goal to minimize the 
> number of files for each controller.  It's very clean.  So I'd suggest 
> we design memory.group_oom, or whatever it ends up being named, with 
> future development in mind.
> 
> For example, rather than simply being binary, we'd probably want the 
> ability to kill all eligible processes attached directly to the victim's 
> mem cgroup *or* all processes attached to its subtree as well.
> 
> I'd suggest it be implemented to accept a string, "default"/"process", 
> "local" or "tree"/"hierarchy", or better names, to define the group oom 
> mechanism for the mem cgroup that is oom when one of its processes is 
> selected as a victim.

I would prefer to keep it boolean, to match the simplicity of the cgroup v2 API.
In a v2 hierarchy, processes can't be attached to non-leaf cgroups,
so I don't see a place for a third meaning.

> 
> > * More precisely: if the OOM killer kills a task,
> > it will traverse the cgroup tree up to the OOM domain (the OOMing memcg or root),
> > looking for the highest-level cgroup with group_oom set. Then it will
> > kill all tasks in that cgroup, if one exists.
> > 
> 
> All such processes that are not oom disabled, yes.
> 

Yep, of course.
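
To make the traversal above concrete, here's a compilable toy sketch of
that walk. The struct layout and all names are invented for illustration;
this is not the actual kernel API:

#include <stddef.h>

/* Toy model of the memcg tree; field names are illustrative only. */
struct mem_cgroup {
	struct mem_cgroup *parent;
	int group_oom;		/* the proposed memory.group_oom knob */
};

/*
 * Walk from the victim's memcg up to the OOM domain (the OOMing memcg
 * or root) and return the highest-level cgroup with group_oom set.
 * Returns NULL if none is set, in which case only the selected task dies.
 */
static struct mem_cgroup *
find_group_oom_root(struct mem_cgroup *victim, struct mem_cgroup *oom_domain)
{
	struct mem_cgroup *memcg, *group = NULL;

	for (memcg = victim; memcg; memcg = memcg->parent) {
		if (memcg->group_oom)
			group = memcg;	/* remember the highest one seen */
		if (memcg == oom_domain)
			break;
	}
	return group;
}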

Thanks!



