On Wed, 21 Jan 2009, Nikanth Karthikesan wrote: > This is a container group based approach to override the oom killer selection > without losing all the benefits of the current oom killer heuristics and > oom_adj interface. > > It adds a tunable oom.victim to the oom cgroup. The oom killer will kill the > process using the usual badness value but only within the cgroup with the > maximum value for oom.victim before killing any process from a cgroup with a > lesser oom.victim number. Oom killing could be disabled by setting > oom.victim=0. > This doesn't help in memcg or cpuset constrained oom conditions, which still go through select_bad_process(). If the oom.victim value is high for a specific cgroup and a memory controller oom occurs in a disjoint cgroup, for example, it's possible to needlessly kill tasks. Obviously that is up to the administrator to configure, but may not be his or her desire for system-wide oom conditions. It may be preferred to kill tasks in a specific cgroup first when the entire system is out of memory or kill tasks within a cgroup attached to a memory controller when it is oom. The same scenario applies for cpuset-constrained ooms. Since oom.victim is given higher preference than all tasks' oom_adj values, it is possible to needlessly kill tasks that do not lead to future memory freeing for the nodes attached to that cpuset. It also requires that you synchronize the oom.victim values amongst your cgroups. _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers