Re: [PATCH] oom_kill: add option to disable dump_stack()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 23 Oct 2015, Aristeu Rozanski wrote:

> One of the largest chunks of log messages in a OOM is from dump_stack() and in
> some cases it isn't even necessary to figure out what's going on. In
> systems with multiple tenants/containers with limited resources each
> OOMs can be way more frequent and being able to reduce the amount of log
> output for each situation is useful.
> 
> This patch adds a sysctl to allow disabling dump_stack() during an OOM while
> keeping the default to behave the same way it behaves today.
> 
> Cc: Greg Thelen <gthelen@xxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: cgroups@xxxxxxxxxxxxxxx
> Signed-off-by: Aristeu Rozanski <arozansk@xxxxxxxxxx>

There's lots of information in the oom log that is irrelevant depending on 
the context in which the oom condition occurred.  Removing the stack trace 
would have made things like commit 9a185e5861e8 ("/proc/stat: convert to 
single_open_size()") harder to fix.  In that case, we were calling the oom 
killer on large file reads from procfs when we could have easily have 
used vmalloc() instead.

When you have a memcg oom kill, the state of the system's memory can 
usually be suppressed because it only occurred because a memcg hierarchy 
reached its limit and has nothing to do with the exhaustion of RAM.

We already control oom output with global sysctls like vm.oom_dump_tasks 
and memcg tunables like memory.oom_verbose.  I'm not sure that adding more 
and more tunables simply to control the oom killer output is in the best 
interest of either procfs or a long-term maintainable kernel.

I can understand the usefulness of having a very small amount of output to 
the kernel log and then enabling tunables to investigate why oom kills are 
happening, but in many situations I've found to only have the oom killer 
output left behind in a kernel log and the situation is not on-going so I 
can't start diagnosing the problem if I don't know what triggered it.

I think adding additional sysctls to control oom killer output is in the 
wrong direction.  I do agree with removing anything that is irrelevant in 
all situations, however.
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux