[patch 01/17] mm: memcontrol: print proper OOM header when no eligible victim left

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Johannes Weiner <hannes@xxxxxxxxxxx>
Subject: mm: memcontrol: print proper OOM header when no eligible victim left

When the memcg OOM killer runs out of killable tasks, it currently prints
a WARN with no further OOM context.  This has caused some user confusion.

Warnings indicate a kernel problem.  In a reported case, however, the
situation was triggered by a nonsensical memcg configuration (hard limit
set to 0).  But without any VM context this wasn't obvious from the
report, and it took some back and forth on the mailing list to identify
what is actually a trivial issue.

Handle this OOM condition like we handle it in the global OOM killer: dump
the full OOM context and tell the user we ran out of tasks.

This way the user can identify misconfigurations easily by themselves and
rectify the problem - without having to go through the hassle of running
into an obscure but unsettling warning, finding the appropriate kernel
mailing list and waiting for a kernel developer to remote-analyze that the
memcg configuration caused this.

If users cannot make sense of why the OOM killer was triggered or why it
failed, they will still report it to the mailing list, we know that from
experience.  So in case there is an actual kernel bug causing this, kernel
developers will very likely hear about it.

Link: http://lkml.kernel.org/r/20180821160406.22578-1-hannes@xxxxxxxxxxx
Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memcontrol.c |    2 --
 mm/oom_kill.c   |   13 ++++++++++---
 2 files changed, 10 insertions(+), 5 deletions(-)

--- a/mm/memcontrol.c~mm-memcontrol-print-proper-oom-header-when-no-eligible-victim-left
+++ a/mm/memcontrol.c
@@ -1701,8 +1701,6 @@ static enum oom_status mem_cgroup_oom(st
 	if (mem_cgroup_out_of_memory(memcg, mask, order))
 		return OOM_SUCCESS;
 
-	WARN(1,"Memory cgroup charge failed because of no reclaimable memory! "
-		"This looks like a misconfiguration or a kernel bug.");
 	return OOM_FAILED;
 }
 
--- a/mm/oom_kill.c~mm-memcontrol-print-proper-oom-header-when-no-eligible-victim-left
+++ a/mm/oom_kill.c
@@ -1103,10 +1103,17 @@ bool out_of_memory(struct oom_control *o
 	}
 
 	select_bad_process(oc);
-	/* Found nothing?!?! Either we hang forever, or we panic. */
-	if (!oc->chosen && !is_sysrq_oom(oc) && !is_memcg_oom(oc)) {
+	/* Found nothing?!?! */
+	if (!oc->chosen) {
 		dump_header(oc, NULL);
-		panic("Out of memory and no killable processes...\n");
+		pr_warn("Out of memory and no killable processes...\n");
+		/*
+		 * If we got here due to an actual allocation at the
+		 * system level, we cannot survive this and will enter
+		 * an endless loop in the allocator. Bail out now.
+		 */
+		if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
+			panic("System is deadlocked on memory\n");
 	}
 	if (oc->chosen && oc->chosen != (void *)-1UL)
 		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
_



[Index of Archives]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux