Re: [patch -mm 0/4] mm, memcg: introduce oom policies

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 17 Jan 2018, Roman Gushchin wrote:

> You're introducing a new oom_policy knob, which has two separate sets
> of possible values for the root and non-root cgroups. I don't think
> it aligns with the existing cgroup v2 design.
> 

The root mem cgroup can use "none" or "cgroup" to either disable or 
enable, respectively, the cgroup aware oom killer.  These are available to 
non root mem cgroups as well, with the addition of hierarchical usage 
comparison to prevent the ability to completely evade the oom killer by 
the user by creating sub cgroups.  Just as we have files that are made 
available on the root cgroup and not others, I think it's fine to allow 
only certain policies on the root mem cgroup.  As I wrote to Tejun, I'm 
fine with the policy being separated from the mechanism.  That can be 
resolved in that email thread, but the overall point of this patchset is 
to allow hierarchical comparison on some subtrees and not on others.  We 
can avoid the mount option in the same manner.

> For the root cgroup it works exactly as mount option, and both "none"
> and "cgroup" values have no meaning outside of the root cgroup. We can
> discuss if a knob on root cgroup is better than a mount option, or not
> (I don't think so), but it has nothing to do with oom policy as you
> define it for non-root cgroups.
> 

It certainly does have value outside of the root cgroup: for system oom 
conditions it is possible to choose the largest process, largest single 
cgroup, or largest subtree to kill from.  For memcg oom conditions it's 
possible for a consumer in a subtree to define whether it wants the 
largest memory hogging process to be oom killed or the largest of its 
child sub cgroups.  This would be needed for a job scheduler in its own 
subtree, for example, that creates its own subtrees to limit jobs 
scheduled by individual users who have the power over their subtree but 
not their limit.  This is a real-world example.  Michal also provided his 
own example concerning top-level /admins and /students.  They want to use 
"cgroup" themselves and then /students/roman would be forced to "tree".

Keep in mind that this patchset alone makes the interface extensible and 
addresses one of my big three concerns, but the comparison of the root mem 
cgroup compared to other individual cgroups and cgroups maintaining a 
subtree needs to be fixed separately so that we don't completely evade all 
of these oom policies by using /proc/pid/oom_score_adj in the root mem 
cgroup.  That's a separate issue that needs to be addressed.  My last 
concern, about userspace influence, can probably be addressed after this 
is merged by yet another tunable since it's vital that important cgroups 
can be protected.  It does make sense for an important cgroup to be able 
to use 50% of memory without being oom killed, and that's impossible if 
cgroup v2 has been mounted with your mount option.
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux