Re: cgroup-aware OOM killer, how to move forward

On Tue 24-07-18 08:52:51, Roman Gushchin wrote:
> On Tue, Jul 24, 2018 at 07:49:40AM -0700, Tejun Heo wrote:
> > Hello, Michal.
> > 
> > On Tue, Jul 24, 2018 at 04:43:51PM +0200, Michal Hocko wrote:
> > > If yes, then I do not see it ;) Mostly because panic_on_oom doesn't have
> > > any scope. It is all or nothing thing. You can only control whether
> > > memcg OOMs should be considered or not because this is inherently
> > > dangerous to be the case by default.
> > 
> > Oh yeah, so, panic_on_oom is like group oom on the root cgroup, right?
> > If 1, it treats the whole system as a single unit and kills it no
> > matter the oom domain.  If 2, it only does so if the oom is not caused
> > by restrictions in subdomains.
> > 
> > > oom_group has a scope and that scope is exactly what we are trying to
> > > find a proper semantic for. And especially what to do if descendants in
> > > the hierarchy disagree with parent(s). While I do not see a sensible
> > > configuration where the scope of the OOM should define the workload is
> > > indivisible I would like to prevent from "carved in stone" semantic that
> > > couldn't be changed later.
> > 
> > And we can scope it down the same way down the cgroup hierarchy.
> > 
> > > So IMHO the best option would be to simply inherit the group_oom to
> > > children. This would allow users to do their weird stuff but the default
> > > configuration would be consistent.
> 
> I think, that the problem occurs because of the default value (0).
> 
> Let's imagine we can make default to 1.
> It means, that by default we kill the whole sub-tree up to the top-level
> cgroup, and it does guarantee consistency.
> If on some level userspace _knows_ how to handle OOM, it opts-out
> by setting oom.group to 0.

Apart from the fact that a default group_oom is absolutely unacceptable, as
explained earlier, I still fail to see how this makes the situation any
different. Say you know that you are not group oom, so what will happen
now? As soon as we check the parents we are screwed the same way. Not to
mention that a global oom would basically mean killing the world...
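
To make the objection concrete, here is a toy model of the "check parents"
lookup. It is only a sketch, not kernel code: the Cgroup class, the
kill_scope() helper and the cgroup names are hypothetical, and it assumes
the kill scope is the topmost cgroup with oom_group set between the
victim and the OOM domain, with a default of 1 as proposed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    """Hypothetical stand-in for a memory cgroup; not a kernel structure."""
    name: str
    oom_group: int                      # 1 = kill as a unit, 0 = opted out
    parent: Optional["Cgroup"] = None


def kill_scope(victim: Cgroup, oom_domain: Cgroup) -> Optional[Cgroup]:
    """Walk from the victim's cgroup up to the OOM domain and return the
    topmost cgroup with oom_group set, or None to kill a single task."""
    scope = None
    node = victim
    while node is not None:
        if node.oom_group:
            scope = node                # a higher group overrides a lower one
        if node is oom_domain:
            break                       # never look above the OOM domain
        node = node.parent
    return scope


# Assumed hierarchy: everything defaults to 1, only "job" opts out.
root = Cgroup("root", oom_group=1)
workload = Cgroup("workload", oom_group=1, parent=root)
job = Cgroup("job", oom_group=0, parent=workload)
```

In this model the opt-out in "job" only matters when "job" itself is the
OOM domain. An OOM in "workload" still kills all of "workload", and a
global OOM resolves to the root, i.e. everything.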

-- 
Michal Hocko
SUSE Labs



