[PATCH v2] mm, memcg: fix inconsistent oom event behavior

Commit 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
changed the behavior of memcg events so that memory.events now accounts
for events in subtrees. But oom_kill is a special event, as it is used
in both cgroup1 and cgroup2. In cgroup1, it is displayed in
memory.oom_control. The file memory.oom_control exists in both the root
memcg and non-root memcgs, unlike memory.events, which exists only in
non-root memcgs. That commit is fine for cgroup2, but not for cgroup1,
where it causes inconsistent behavior between the root memcg and
non-root memcgs.

Here's an example of why this behavior is inconsistent in cgroup1.
     root memcg
     /
  memcg foo
   /
memcg bar

Suppose there's an oom_kill in memcg bar. The oom_kill counts will
then be:

     root memcg : memory.oom_control(oom_kill)  0
     /
  memcg foo : memory.oom_control(oom_kill)  1
   /
memcg bar : memory.oom_control(oom_kill)  1

For a non-root memcg, memory.oom_control(oom_kill) includes its
descendants' oom_kill, but for the root memcg it does not. That means
memory.oom_control(oom_kill) has different meanings in different
memcgs, which is inconsistent: the user has to know whether a memcg is
the root or not.
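
The counts above can be reproduced with a small userspace model. This
is only a sketch: struct model_memcg, model_oom_kill() and model_demo()
are hypothetical names made up for illustration, not kernel code. It
assumes that the propagation walks up the ancestors and stops before
the root, which is what the example above implies, and that the local
events flag makes the walk stop after the first memcg.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a memcg; not the kernel's struct mem_cgroup. */
struct model_memcg {
	struct model_memcg *parent;	/* NULL for the root memcg */
	long oom_kill;			/* value shown as oom_kill */
};

/*
 * Record one oom_kill in @memcg.
 * With @local_events set, only @memcg itself is counted.
 * Otherwise the event is also counted in every non-root ancestor.
 */
static void model_oom_kill(struct model_memcg *memcg, bool local_events)
{
	do {
		memcg->oom_kill++;
		if (local_events)
			break;
		memcg = memcg->parent;
	} while (memcg && memcg->parent);	/* stop before the root */
}

/*
 * Build root -> foo -> bar, trigger one oom_kill in bar and store the
 * resulting counts as counts[0] = root, counts[1] = foo, counts[2] = bar.
 */
static void model_demo(bool local_events, long counts[3])
{
	struct model_memcg root = { .parent = NULL };
	struct model_memcg foo = { .parent = &root };
	struct model_memcg bar = { .parent = &foo };

	model_oom_kill(&bar, local_events);

	counts[0] = root.oom_kill;
	counts[1] = foo.oom_kill;
	counts[2] = bar.oom_kill;
}
```

In this model, model_demo(false, counts) gives 0/1/1 for root/foo/bar,
matching the inconsistent result shown above, while
model_demo(true, counts) gives 0/0/1, so every memcg reports only its
own oom_kill.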

If we can't fully support it in cgroup1, for example by adding
memory.events.local to cgroup1 as well, then let's not touch
its original behavior.

Per discussion with Michal, setting CGRP_ROOT_MEMORY_LOCAL_EVENTS for
the legacy hierarchy by default is better than special-casing it
somewhere quite deep in the code.

Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
Cc: Chris Down <chris@xxxxxxxxxxxxxx>
Cc: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
---
 mm/memcontrol.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5beea03dd58a..0f7381bddcee 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5940,10 +5940,20 @@ static void mem_cgroup_bind(struct cgroup_subsys_state *root_css)
 	 * guarantees that @root doesn't have any children, so turning it
 	 * on for the root memcg is enough.
 	 */
-	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
+	if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
 		root_mem_cgroup->use_hierarchy = true;
-	else
+	} else {
 		root_mem_cgroup->use_hierarchy = false;
+		/*
+		 * Set CGRP_ROOT_MEMORY_LOCAL_EVENTS for legacy hierarchy
+		 * by default to avoid inconsistent oom_kill behavior
+		 * between root memcg and non-root memcg.
+		 * Regarding default hierarchy, as this flag will be set
+		 * or cleared later, we don't need to process it in this
+		 * function.
+		 */
+		cgrp_dfl_root.flags |= CGRP_ROOT_MEMORY_LOCAL_EVENTS;
+	}
 }
 
 static int seq_puts_memcg_tunable(struct seq_file *m, unsigned long value)
-- 
2.18.2




