[alternative-merged] mm-oom-add-cgroup-v2-mount-option-for-cgroup-aware-oom-killer.patch removed from -mm tree

The patch titled
     Subject: mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer
has been removed from the -mm tree.  Its filename was
     mm-oom-add-cgroup-v2-mount-option-for-cgroup-aware-oom-killer.patch

This patch was dropped because an alternative patch was merged

------------------------------------------------------
From: Roman Gushchin <guro@xxxxxx>
Subject: mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

Add a "groupoom" cgroup v2 mount option to enable the cgroup-aware OOM
killer.  If not set, the OOM selection is performed in a "traditional"
per-process way.

The behavior can be changed dynamically by remounting the cgroupfs.
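
As a rough userspace illustration of the option handling this patch adds to
parse_cgroup_root_flags() and apply_cgroup_root_flags(), the sketch below maps
a comma-separated option string to flag bits and shows why a remount without
"groupoom" turns the feature off again.  All demo_/DEMO_ names are
hypothetical, the nsdelegate bit value is an assumption, and skipping empty
tokens is a choice of this sketch rather than something the patch states.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-ins for the kernel flags; not the kernel's own names. */
#define DEMO_ROOT_NS_DELEGATE (1u << 3)	/* bit value assumed */
#define DEMO_GROUP_OOM        (1u << 5)	/* bit value taken from the patch */

/*
 * Parse a comma-separated option string into flag bits.
 * Returns 0 on success, -1 if an unknown option is found.
 */
int demo_parse_root_flags(const char *opts, unsigned int *flags)
{
	const char *p = opts;

	assert(opts && flags);
	*flags = 0;

	while (*p) {
		size_t len = strcspn(p, ",");	/* length of the current token */

		if (len == 10 && !strncmp(p, "nsdelegate", len))
			*flags |= DEMO_ROOT_NS_DELEGATE;
		else if (len == 8 && !strncmp(p, "groupoom", len))
			*flags |= DEMO_GROUP_OOM;
		else if (len != 0)
			return -1;		/* unknown option */

		p += len;
		if (*p == ',')
			p++;			/* step over the separator */
	}
	return 0;
}

/*
 * Apply freshly parsed flags on (re)mount: each managed bit is set if
 * present and cleared otherwise, so omitting an option disables it.
 */
unsigned int demo_apply_root_flags(unsigned int current, unsigned int parsed)
{
	const unsigned int managed = DEMO_ROOT_NS_DELEGATE | DEMO_GROUP_OOM;

	return (current & ~managed) | (parsed & managed);
}
```

Because the apply step clears every managed bit that was not passed, the
option string given at remount time fully determines the resulting state.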

Link: http://lkml.kernel.org/r/20171130152824.1591-6-guro@xxxxxx
Signed-off-by: Roman Gushchin <guro@xxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
From: Roman Gushchin <guro@xxxxxx>
Subject: mm, oom, docs: describe the cgroup-aware OOM killer

Document the cgroup-aware OOM killer.

Link: http://lkml.kernel.org/r/20171130152824.1591-7-guro@xxxxxx
Signed-off-by: Roman Gushchin <guro@xxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
From: Roman Gushchin <guro@xxxxxx>
Subject: mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix

Add a note that cgroup-aware OOM logic is disabled by default
and describe how to enable it.

Link: http://lkml.kernel.org/r/20171201170149.GB27436@xxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Roman Gushchin <guro@xxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
From: Michal Hocko <mhocko@xxxxxxxx>
Subject: oom, memcg: clarify root memcg oom accounting

David Rientjes has pointed out that the way the root memcg is accounted
for by the cgroup-aware OOM killer is undocumented.  Unlike regular
cgroups, there is no accounting going on in the root memcg (mostly for
performance reasons).  Therefore we are summing up the oom_badness of its
tasks.  This might result in over-accounting because of the
oom_score_adj setting.  Document this for now.
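
To make the accounting concern concrete, here is a deliberately simplified
model, not the kernel's oom_badness(): each task's score is its page count
plus oom_score_adj scaled by total memory, clamped to at least 1, and the root
cgroup's size is the sum over its tasks.  The struct, the function names and
the exact formula are assumptions made for illustration only.

```c
#include <stddef.h>

#define DEMO_OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical per-task record used only by this sketch. */
struct demo_task {
	long pages;		/* memory actually used by the task */
	int oom_score_adj;	/* -1000 .. 1000 */
};

/* Simplified badness: usage shifted by oom_score_adj, never below 1. */
long demo_task_badness(const struct demo_task *t, long totalpages)
{
	long points;

	if (t->oom_score_adj == DEMO_OOM_SCORE_ADJ_MIN)
		return 0;	/* unkillable task contributes nothing */

	points = t->pages + (long)t->oom_score_adj * (totalpages / 1000);
	return points > 0 ? points : 1;
}

/* Root cgroup "size" as the sum of the badness of all its tasks. */
long demo_root_score(const struct demo_task *tasks, size_t n, long totalpages)
{
	long sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += demo_task_badness(&tasks[i], totalpages);
	return sum;
}
```

In this model, two tasks of 100 pages each on a 1000-page system score 700 in
total if one of them has oom_score_adj set to 500, although only 200 pages
are in use; a negative oom_score_adj skews the sum in the other direction.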

Link: http://lkml.kernel.org/r/20180130122011.GB21609@xxxxxxxxxxxxxx
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
Acked-by: Roman Gushchin <guro@xxxxxx>
Acked-by: Tejun Heo <tj@xxxxxxxxxx>
Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Reviewed-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 Documentation/admin-guide/cgroup-v2.rst |   74 ++++++++++++++++++++++
 include/linux/cgroup-defs.h             |    5 +
 kernel/cgroup/cgroup.c                  |   10 ++
 mm/memcontrol.c                         |    3 
 4 files changed, 92 insertions(+)

--- a/include/linux/cgroup-defs.h~mm-oom-add-cgroup-v2-mount-option-for-cgroup-aware-oom-killer
+++ a/include/linux/cgroup-defs.h
@@ -81,6 +81,11 @@ enum {
 	 * Enable cpuset controller in v1 cgroup to use v2 behavior.
 	 */
 	CGRP_ROOT_CPUSET_V2_MODE = (1 << 4),
+
+	/*
+	 * Enable cgroup-aware OOM killer.
+	 */
+	CGRP_GROUP_OOM = (1 << 5),
 };
 
 /* cftype->flags */
--- a/kernel/cgroup/cgroup.c~mm-oom-add-cgroup-v2-mount-option-for-cgroup-aware-oom-killer
+++ a/kernel/cgroup/cgroup.c
@@ -1747,6 +1747,9 @@ static int parse_cgroup_root_flags(char
 		if (!strcmp(token, "nsdelegate")) {
 			*root_flags |= CGRP_ROOT_NS_DELEGATE;
 			continue;
+		} else if (!strcmp(token, "groupoom")) {
+			*root_flags |= CGRP_GROUP_OOM;
+			continue;
 		}
 
 		pr_err("cgroup2: unknown option \"%s\"\n", token);
@@ -1763,6 +1766,11 @@ static void apply_cgroup_root_flags(unsi
 			cgrp_dfl_root.flags |= CGRP_ROOT_NS_DELEGATE;
 		else
 			cgrp_dfl_root.flags &= ~CGRP_ROOT_NS_DELEGATE;
+
+		if (root_flags & CGRP_GROUP_OOM)
+			cgrp_dfl_root.flags |= CGRP_GROUP_OOM;
+		else
+			cgrp_dfl_root.flags &= ~CGRP_GROUP_OOM;
 	}
 }
 
@@ -1770,6 +1778,8 @@ static int cgroup_show_options(struct se
 {
 	if (cgrp_dfl_root.flags & CGRP_ROOT_NS_DELEGATE)
 		seq_puts(seq, ",nsdelegate");
+	if (cgrp_dfl_root.flags & CGRP_GROUP_OOM)
+		seq_puts(seq, ",groupoom");
 	return 0;
 }
 
--- a/mm/memcontrol.c~mm-oom-add-cgroup-v2-mount-option-for-cgroup-aware-oom-killer
+++ a/mm/memcontrol.c
@@ -3022,6 +3022,9 @@ bool mem_cgroup_select_oom_victim(struct
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
 		return false;
 
+	if (!(cgrp_dfl_root.flags & CGRP_GROUP_OOM))
+		return false;
+
 	if (oc->memcg)
 		root = oc->memcg;
 	else
--- a/Documentation/admin-guide/cgroup-v2.rst~mm-oom-add-cgroup-v2-mount-option-for-cgroup-aware-oom-killer
+++ a/Documentation/admin-guide/cgroup-v2.rst
@@ -48,6 +48,7 @@ v1 is available under Documentation/cgro
        5-2-1. Memory Interface Files
        5-2-2. Usage Guidelines
        5-2-3. Memory Ownership
+       5-2-4. OOM Killer
      5-3. IO
        5-3-1. IO Interface Files
        5-3-2. Writeback
@@ -1069,6 +1070,31 @@ PAGE_SIZE multiple when read back.
 	high limit is used and monitored properly, this limit's
 	utility is limited to providing the final safety net.
 
+  memory.oom_group
+
+	A read-write single value file which exists on non-root
+	cgroups.  The default is "0".
+
+	If set, the OOM killer will consider the memory cgroup as an
+	indivisible memory consumer and compare it with other memory
+	consumers by its memory footprint.
+	If such a memory cgroup is selected as an OOM victim, all
+	processes belonging to it or its descendants will be killed.
+
+	This applies to system-wide OOM conditions and to reaching
+	the hard memory limit of the cgroup or of one of its ancestors.
+	If an OOM condition happens in a descendant cgroup with its own
+	memory limit, the memory cgroup can't be considered
+	as an OOM victim, and the OOM killer will not kill all tasks
+	belonging to it.
+
+	Also, OOM killer respects the /proc/pid/oom_score_adj value -1000,
+	and will never kill the unkillable task, even if memory.oom_group
+	is set.
+
+	If the cgroup-aware OOM killer is not enabled, an ENOTSUPP error
+	is returned on any attempt to access the file.
+
   memory.events
 	A read-only flat-keyed file which exists on non-root cgroups.
 	The following entries are defined.  Unless specified
@@ -1293,6 +1319,54 @@ to be accessed repeatedly by other cgrou
 POSIX_FADV_DONTNEED to relinquish the ownership of memory areas
 belonging to the affected files to ensure correct memory ownership.
 
+OOM Killer
+~~~~~~~~~~
+
+Cgroup v2 memory controller implements a cgroup-aware OOM killer.
+It means that it treats cgroups as first-class OOM entities.
+
+Cgroup-aware OOM logic is turned off by default and requires
+passing the "groupoom" option on mounting cgroupfs. It can also
+be enabled by remounting cgroupfs with the following command::
+
+  # mount -o remount,groupoom $MOUNT_POINT
+
+Under OOM conditions the memory controller tries to make the best
+choice of a victim, looking for a memory cgroup with the largest
+memory footprint, considering leaf cgroups and cgroups with the
+memory.oom_group option set, which are treated as indivisible
+memory consumers.
+
+By default, OOM killer will kill the biggest task in the selected
+memory cgroup. A user can change this behavior by enabling
+the per-cgroup memory.oom_group option. If set, it causes
+the OOM killer to kill all processes attached to the cgroup,
+except processes with oom_score_adj set to -1000.
+
+This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
+the memory controller considers only cgroups belonging to the sub-tree
+of the OOM'ing cgroup.
+
+Leaf cgroups and cgroups with the oom_group option set are compared based
+on their cumulative memory usage. The root cgroup is treated as a
+leaf memory cgroup as well, so it's compared with other leaf memory
+cgroups. Due to internal implementation restrictions the size of
+the root cgroup is a cumulative sum of oom_badness of all its tasks
+(in other words oom_score_adj of each task is obeyed). Relying on
+oom_score_adj (apart from OOM_SCORE_ADJ_MIN) can lead to over- or
+underestimation of the root cgroup consumption and it is therefore
+discouraged. This might change in the future, though.
+
+If there are no cgroups with the memory controller enabled,
+the OOM killer uses the "traditional" process-based approach.
+
+Please note that memory charges do not migrate when tasks
+are moved between different memory cgroups. Moving tasks with
+a significant memory footprint may affect OOM victim selection logic.
+If that is the case, consider creating a common ancestor for
+the source and destination memory cgroups and enabling oom_group
+on the ancestor.
+
 
 IO
 --
_
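
The victim selection described in the "OOM Killer" documentation hunk above
(compare leaf cgroups and cgroups with memory.oom_group set by their
cumulative usage, pick the largest) can be sketched roughly as a tree walk.
This is a userspace model under stated assumptions, not the code in
mm/memcontrol.c: struct demo_cgroup, its fields and demo_select_victim() are
hypothetical, and the usage field is assumed to already hold the cumulative
footprint of the cgroup and its descendants.

```c
#include <stddef.h>

/* Hypothetical cgroup node used only by this sketch. */
struct demo_cgroup {
	const char *name;
	long usage;			/* cumulative usage incl. descendants */
	int oom_group;			/* models memory.oom_group */
	const struct demo_cgroup *children;
	size_t nr_children;
};

/*
 * Return the candidate with the largest footprint in the subtree of cg.
 * A leaf, or a cgroup with oom_group set, is an indivisible candidate;
 * any other cgroup is only a container whose children are examined.
 */
const struct demo_cgroup *demo_select_victim(const struct demo_cgroup *cg)
{
	const struct demo_cgroup *best = NULL;

	if (cg->nr_children == 0 || cg->oom_group)
		return cg;

	for (size_t i = 0; i < cg->nr_children; i++) {
		const struct demo_cgroup *cand =
			demo_select_victim(&cg->children[i]);

		if (!best || cand->usage > best->usage)
			best = cand;
	}
	return best;
}
```

For a cgroup-wide OOM the walk would start at the OOM'ing cgroup instead of
the root, which restricts the candidates to its sub-tree.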

Patches currently in -mm which might be from guro@xxxxxx are

mm-introduce-mem_cgroup_put-helper.patch
mm-oom-docs-describe-the-cgroup-aware-oom-killer.patch
cgroup-list-groupoom-in-cgroup-features.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html