Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

On Tue, 23 Jan 2018, Michal Hocko wrote:

> > It can't, because the current patchset locks the system into a single 
> > selection criterion that is unnecessary and the mount option would become a 
> > no-op after the policy per subtree becomes configurable by the user as 
> > part of the hierarchy itself.
> 
> This is simply not true! OOM victim selection has changed in the
> past and will always be subject to change in the future. The current
> implementation doesn't provide any externally controllable selection
> policy and therefore the default can be assumed. Whatever that default
> means now or in future. The only contract added here is to kill the full
> memcg if selected and that can be implemented on _any_ selection policy.
> 

The current implementation of memory.oom_group is based on top of a 
selection implementation that is broken in three ways I have listed for 
months:

 - allows users to intentionally/unintentionally evade the oom killer;
   preventing that requires subtree control and requires not locking
   the selection implementation for the entire system, which makes a
   mount option obsolete and breaks existing users who would use the
   implementation based on 4.16 if this were merged,

 - unfairly compares the root mem cgroup vs leaf mem cgroups, such that
   users must structure their hierarchy, only for 4.16, so that _all_
   processes are under hierarchical control and have no power to
   create sub cgroups because of the point above; it also breaks any
   user of oom_score_adj in a completely undocumented and unspecified
   way, such that fixing that breakage would also break any existing
   users who would use the implementation based on 4.16 if this were
   merged, and

 - does not allow userspace to protect important cgroups, although that
   can be built on top.
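
As a rough illustration of the first point, here is a toy model of the
evasion. It is not kernel code: the cgroup names, the usage figures, and
both helper functions are hypothetical, and it assumes a selection that
compares leaf cgroups individually by usage versus one that compares
whole subtrees.

```python
# Toy model only: not kernel code. All names and numbers are made up.

def pick_by_leaf(leaves):
    """Pick the single leaf cgroup with the largest usage."""
    return max(leaves, key=leaves.get)


def pick_by_subtree(leaves):
    """Sum usage per top-level cgroup, then pick the largest subtree."""
    totals = {}
    for path, usage in leaves.items():
        top = path.split("/")[0]
        totals[top] = totals.get(top, 0) + usage
    return max(totals, key=totals.get)


# Usage in GB. "batch" has split its 8 GB across eight child cgroups.
leaves = {"web": 2}
leaves.update({f"batch/job{i}": 1 for i in range(8)})

print(pick_by_leaf(leaves))     # web
print(pick_by_subtree(leaves))  # batch
```

In this sketch the larger consumer avoids selection under per-leaf
comparison simply by creating sub cgroups, while a per-subtree
comparison still selects it.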

I'm focused on fixing the breakage in the first two points since it 
affects the API and we don't want to switch that out from under the 
user.  I have brought these points up repeatedly and everybody else has 
actively disengaged from development, so I'm proposing incremental 
changes that give the cgroup aware oom killer a sustainable API and make 
it useful beyond a highly specialized usecase where everything is 
containerized, nobody can create subcgroups, and nobody uses 
oom_score_adj to break the root mem cgroup accounting.


