Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

Michal Hocko <mhocko@xxxxxxx> · Fri, 8 Feb 2013 10:44:20 +0100

On Fri 08-02-13 06:03:04, azurIt wrote:
> Michal, thank you very much but it just didn't work and broke
> everything :(

I am sorry to hear that. The patch should help to solve the deadlock you
have seen earlier. It cannot solve the side effects of failing writes,
though, and it cannot help much if the OOM condition is permanent.

> This happened:
> Problem started to occur really often immediately after booting the
> new kernel, every few minutes for one of my users. But everything
> other seems to work fine so i gave it a try for a day (which was a
> mistake). I grabbed some data for you and go to sleep:
> http://watchdog.sk/lkml/memcg-bug-4.tar.gz

Do you have logs from that time period?

I have only glanced through the stacks. Most of the threads are
waiting in mem_cgroup_handle_oom (mostly from the page fault path,
where we have no option other than waiting), which suggests that
your memory limit is seriously underestimated. If you look at the number
of charging failures (the per-group memory.failcnt file) you get
9332083 failures on _average_ per group. This is a lot!
Not all of those failures end in OOM, of course, but it clearly signals
that the workload needs much more memory than the limit allows.
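
To reproduce that average from a set of collected cgroup files, a
Python sketch along these lines should do. It assumes each group's
files sit in their own directory somewhere under a common root and that
memory.failcnt holds a single integer. The function name
average_failcnt and that directory layout are illustrative assumptions,
not something taken from the tarball.

```python
import os


def average_failcnt(root):
    """Return the mean of every memory.failcnt value found under root.

    Assumption: one memory.failcnt file per group directory, each
    containing a single decimal integer (the cgroup v1 format).
    """
    values = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "memory.failcnt" in filenames:
            with open(os.path.join(dirpath, "memory.failcnt")) as f:
                values.append(int(f.read().strip()))
    if not values:
        raise ValueError("no memory.failcnt files found under %r" % root)
    return sum(values) / len(values)
```

Calling average_failcnt(".") from the top of the extracted data would
give the per-group mean. A high value only says that charges hit the
limit often; as noted above, not every failed charge ends in OOM.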

> Few hours later i was woke up from my sweet sweet dreams by alerts
> smses - Apache wasn't working and our system failed to restart
> it. When i observed the situation, two apache processes (of that user
> as above) were still running and it wasn't possible to kill them by
> any way. I grabbed some data for you:
> http://watchdog.sk/lkml/memcg-bug-5.tar.gz

There are only 5 groups in this one and all of them have no memory
charged (so no OOM going on). All tasks are somewhere in the ptrace
code.

grep cache -r .
./1360297489/memory.stat:cache 0
./1360297489/memory.stat:total_cache 65642496
./1360297491/memory.stat:cache 0
./1360297491/memory.stat:total_cache 65642496
./1360297492/memory.stat:cache 0
./1360297492/memory.stat:total_cache 65642496
./1360297490/memory.stat:cache 0
./1360297490/memory.stat:total_cache 65642496
./1360297488/memory.stat:cache 0
./1360297488/memory.stat:total_cache 65642496

which suggests that this is a parent group and the memory is charged in
a child group. I guess that all of those are under OOM, as the number
suggests they have their limit at about 62M.
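
For reference, the reasoning behind those numbers can be sketched in
Python. In the cgroup v1 memory.stat file the plain counters cover the
group itself, while the total_ counters also include all descendants,
so "cache 0" next to a non-zero total_cache means the page cache is
charged further down the hierarchy. The helper names below are made up
for illustration; the sample text is the pair of lines from the grep
output above.

```python
def parse_memory_stat(text):
    """Parse "name value" lines from a memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def to_mib(nbytes):
    """Convert a byte count to MiB."""
    return nbytes / (1024 * 1024)


# The two lines grep reported for each group.
sample = "cache 0\ntotal_cache 65642496\n"
stats = parse_memory_stat(sample)

local_mib = to_mib(stats["cache"])        # charged to this group itself
hier_mib = to_mib(stats["total_cache"])   # this group plus its children

print("cache: %.1f MiB, total_cache: %.1f MiB" % (local_mib, hier_mib))
# prints "cache: 0.0 MiB, total_cache: 62.6 MiB"
```

So roughly 62.6 MiB of page cache is accounted to the children, which
is where the guess about the limit comes from.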

> Then I logged to the console and this was waiting for me:
> http://watchdog.sk/lkml/error.jpg

This is just a warning and it should be harmless. There is just one WARN
in ptrace_check_attach:
	WARN_ON_ONCE(task_is_stopped(child))

This has been introduced by
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=321fb561
and the commit description claims this shouldn't happen. I am not
familiar with this code, but it sounds like a bug in the tracing code
that is unrelated to the issue under discussion.

> Finally i rebooted into different kernel, wrote this e-mail and go to
> my lovely bed ;)
-- 
Michal Hocko
SUSE Labs