2017-05-19 15:22 GMT+01:00 Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>:
> Show count of global oom killer invocations in /proc/vmstat and
> count of oom kills inside memory cgroup in knob "memory.events"
> (in memory.oom_control for v1 cgroup).
>
> Also describe difference between "oom" and "oom_kill" in memory
> cgroup documentation. Currently oom in memory cgroup kills tasks
> iff shortage has happened inside page fault.
>
> These counters helps in monitoring oom kills - for now
> the only way is grepping for magic words in kernel log.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
> ---
>  Documentation/cgroup-v2.txt   | 12 +++++++++++-
>  include/linux/memcontrol.h    |  1 +
>  include/linux/vm_event_item.h |  1 +
>  mm/memcontrol.c               |  2 ++
>  mm/oom_kill.c                 |  6 ++++++
>  mm/vmstat.c                   |  1 +
>  6 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> index dc5e2dcdbef4..a742008d76aa 100644
> --- a/Documentation/cgroup-v2.txt
> +++ b/Documentation/cgroup-v2.txt
> @@ -830,9 +830,19 @@ PAGE_SIZE multiple when read back.
>
>           oom
>
> +               The number of time the cgroup's memory usage was
> +               reached the limit and allocation was about to fail.
> +               Result could be oom kill, -ENOMEM from any syscall or
> +               completely ignored in cases like disk readahead.
> +               For now oom in memory cgroup kills tasks iff shortage
> +               has happened inside page fault.

From a user's point of view, the difference between "oom" and "max"
becomes really vague here, given that "max" is described in almost
the same words:

"The number of times the cgroup's memory usage was
about to go over the max boundary. If direct reclaim
fails to bring it down, the OOM killer is invoked."

I wonder whether it would be better to fix the existing "oom" value
so that it shows what the docs say it should, rather than introduce
a new one?

> +
> +         oom_kill
> +
>                 The number of times the OOM killer has been invoked in
>                 the cgroup.  This may not exactly match the number of
> -               processes killed but should generally be close.
> +               processes killed but should generally be close: each
> +               invocation could kill several processes at once.
>
>    memory.stat
>
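
For reference, this is roughly how a monitoring script would read the
two counters side by side if the patch went in as posted. It is only a
sketch: it assumes memory.events keeps its flat "key value" layout and
gains an "oom_kill" line, and the helper names and the cgroup path in
the usage note are hypothetical.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse the flat "key value" lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        events[key] = int(value)
    return events


def read_oom_counters(cgroup_dir):
    """Return (oom, oom_kill) for one cgroup directory.

    Assumption: "oom_kill" exists only with the patch applied, so a
    missing key is reported as 0 rather than raising an error.
    """
    text = (Path(cgroup_dir) / "memory.events").read_text()
    events = parse_memory_events(text)
    return events.get("oom", 0), events.get("oom_kill", 0)
```

Usage would be something like read_oom_counters("/sys/fs/cgroup/mygroup"),
where "mygroup" is a made-up cgroup name. Even with both numbers in hand,
a user still has to know how "oom" differs from "max", which is the part
the documentation leaves vague.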