Re: [PATCH] mm/page_alloc: Wait for oom_lock before retrying.

Petr Mladek <pmladek@xxxxxxxx> · Wed, 14 Dec 2016 13:36:44 +0100

On Wed 2016-12-14 20:37:51, Tetsuo Handa wrote:
> Petr Mladek wrote:
> > On Tue 2016-12-13 21:06:57, Tetsuo Handa wrote:
> > > Uptime > 400 are testcases where the stresser was invoked via "taskset -c 0".
> > > Since there are some "** XXX printk messages dropped **" messages, I can't
> > > tell whether the OOM killer was able to make forward progress. But guessing
> > > from the result that there is no corresponding "Killed process" line for
> > > "Out of memory: " line at uptime = 450 and the duration of PID 14622 stalled,
> > > I think it is OK to say that the system got stuck because the OOM killer was
> > > not able to make forward progress.
> > 
> > I am afraid that as long as you see "** XXX printk messages dropped
> > **" then there is something that is able to keep warn_alloc() busy,
> > never leave the printk()/console_unlock() and block OOM killer
> > progress.
> 
> Excuse me, but it is not warn_alloc() but functions that call printk()
> which are kept busy with oom_lock held (e.g. oom_kill_process()).

No, they are keeping each other busy. If I understand it correctly,
this is a livelock:

First, the OOM killer stalls inside console_unlock() because
other processes produce new messages faster than it is able to
push them to the console.

Second, the other processes stall because they are waiting for
the OOM killer to get some free memory.

Now, the blocked processes try to report the situation, and
that is what produces so many messages. But there are also other
producers, like the hung task detector, that see the problem
from outside and try to report it as well.
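To make the livelock above concrete, here is a toy userspace model. It is
not kernel code, and every name in it (toy_console, toy_flush_console, the
per-round rates) is made up for illustration. The only assumption is that
the task holding the console must push out everything that is pending
before it can return, while other tasks keep appending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical, not kernel API. */
struct toy_console {
	unsigned long pending;	/* messages waiting to be pushed out */
};

/*
 * Model of a task (think: the OOM killer) stuck flushing the console.
 * Each round it pushes up to flushed_per_round messages, then the
 * stalled tasks add produced_per_round new ones.
 * Returns true once the backlog is empty, false if the flusher was
 * still busy after max_rounds.
 */
bool toy_flush_console(struct toy_console *con,
		       unsigned long flushed_per_round,
		       unsigned long produced_per_round,
		       unsigned long max_rounds)
{
	assert(con != NULL);

	for (unsigned long round = 0; round < max_rounds; round++) {
		/* Push out what the slow console can take this round. */
		if (con->pending > flushed_per_round)
			con->pending -= flushed_per_round;
		else
			con->pending = 0;

		if (con->pending == 0)
			return true;	/* flusher can go back to real work */

		/* The waiting tasks complain while the flusher is busy. */
		con->pending += produced_per_round;
	}
	return false;
}
```

When the producers add at least as much per round as the console can take,
the backlog never shrinks and the flusher never returns. When they add
less, it drains after a bounded number of rounds.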

There are basically two solutions for this situation:

1. Fix printk() so that it does not block forever. This will
   get solved by the async printk patchset[*]. In the meantime,
   a particularly sensitive location might be worked around
   by using printk_deferred() instead of printk()[**].

2. Reduce the amount of messages. It is insane to report
   the same problem many times so that the same messages
   fill the entire log buffer. Note that the allocator
   is not the only sinner here.

In fact, both solutions make sense together.
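One common way to reduce repeated messages is rate limiting: allow a small
burst per time window and only count the rest. The kernel has its own
helpers for this (for example printk_ratelimited()); the sketch below does
not use them. It is a userspace toy with hypothetical names
(toy_ratelimit, toy_ratelimit_ok) and an arbitrary tick-based clock, meant
only to show the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; names are hypothetical, not a kernel API. */
struct toy_ratelimit {
	unsigned long interval;		/* window length, in arbitrary ticks */
	unsigned int burst;		/* messages allowed per window */
	unsigned long window_start;	/* tick at which the window began */
	unsigned int printed;		/* messages let through this window */
	unsigned int suppressed;	/* messages dropped this window */
};

/* Returns true if the caller may print a message at time 'now'. */
bool toy_ratelimit_ok(struct toy_ratelimit *rl, unsigned long now)
{
	assert(rl != NULL);

	/* Start a fresh window once the old one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->printed = 0;
		rl->suppressed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	/* Over budget: remember that something was dropped, print nothing. */
	rl->suppressed++;
	return false;
}
```

With interval = 10 and burst = 2, ten back-to-back reports of the same
problem result in two messages and a suppressed count of eight, instead of
ten identical lines in the log buffer.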

[*] The async printk patchset has been floating around in many
    variants for years. I am more optimistic after
    the discussions at the last Kernel Summit. Anyway,
    it will not be in mainline before 4.12.

[**] printk_deferred() only puts messages into the log
     buffer. It does not call
     console_trylock()/console_unlock(). Therefore,
     it is always "fast".
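The difference described in [**] can be sketched with another userspace
toy. This is not how printk() is implemented; the names (toy_log,
toy_printk, toy_printk_deferred) are hypothetical, and the model leaves
out who flushes the deferred messages later. It only shows why a caller
that merely stores the message never pays for the backlog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; names are hypothetical, not kernel API. */
struct toy_log {
	unsigned long stored;	/* messages put into the log buffer */
	unsigned long flushed;	/* messages already pushed to the console */
	bool console_locked;	/* stands in for the console lock */
};

/*
 * Store only. Returns the number of console writes this caller had to
 * do, which is always zero, no matter how large the backlog is.
 */
unsigned long toy_printk_deferred(struct toy_log *log)
{
	assert(log != NULL);
	log->stored++;
	return 0;
}

/*
 * Store, then try to take the console. If that succeeds, the caller
 * flushes the whole backlog, including messages stored by others.
 * Returns the number of console writes this caller had to do.
 */
unsigned long toy_printk(struct toy_log *log)
{
	unsigned long written = 0;

	assert(log != NULL);
	log->stored++;

	if (log->console_locked)
		return 0;	/* someone else is already flushing */

	log->console_locked = true;
	while (log->flushed < log->stored) {
		log->flushed++;
		written++;
	}
	log->console_locked = false;
	return written;
}
```

In this model, a single toy_printk() call after 1000 deferred messages
ends up doing 1001 console writes, while each of the deferred calls did
none.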

Best Regards,
Petr
