Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

Hello, Linus, Andrew.

On Wed, Jan 10, 2018 at 05:29:00PM +0100, Petr Mladek wrote:
> Where is the acceptable compromise? I am not sure. So far, the most
> forceful people (Linus) did not see softlockups as a big problem.
> They rather wanted to see the messages.

Can you please chime in?  Would you be opposed to offloading to an
independent context even if it were only for cases where we were
already punting?  The thing with the current offloading is that we
don't know who we're offloading to.  It might end up in a faster or
slower context or, more importantly, a dangerous one.

The particular case that we've been seeing regularly in the fleet was
the following scenario.

1. Console is IPMI emulated serial console.  Super slow.  Also
   netconsole is in use.
2. System runs out of memory, OOM triggers.
3. OOM handler is printing out OOM debug info.
4. While trying to emit the messages for netconsole, the network stack
   / driver tries to allocate memory and fails, which in turn
   triggers allocation failure or other warning messages.  printk was
   already flushing, so the messages are queued on the ring.
5. OOM handler keeps flushing but step 4 repeats and the queue never
   shrinks.  Because OOM handler is trapped in printk flushing, it
   never manages to free memory and no one else can enter OOM path
   either, so the system is trapped in this state.
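
To make the feedback loop in steps 4 and 5 concrete, here is a toy
userspace model in Python.  It is only a sketch, not kernel code: the
function name, the deque standing in for the ring, and the
"spawn_per_emit" knob (how many new messages each emitted message
provokes) are all assumptions made for illustration.

```python
from collections import deque


def flush_inline(initial, spawn_per_emit, max_steps):
    """Toy model of a task that drains the message ring by itself.

    Hypothetical helper, not a kernel API.  Every emitted message
    provokes `spawn_per_emit` new messages, standing in for the
    allocation-failure warnings from the netconsole path.
    Returns (messages emitted, messages still queued).
    """
    ring = deque(range(initial))
    next_id = initial
    emitted = 0
    # max_steps only exists so the model terminates; the stuck
    # flusher in the scenario has no such bound.
    while ring and emitted < max_steps:
        ring.popleft()              # "emit" the oldest message
        emitted += 1
        for _ in range(spawn_per_emit):
            ring.append(next_id)    # emitting caused a new message
            next_id += 1
    return emitted, len(ring)


if __name__ == "__main__":
    print(flush_inline(10, 0, 1000))   # no feedback: ring drains
    print(flush_inline(10, 1, 1000))   # feedback: ring never shrinks
```

With no feedback the flusher finishes after the ten initial messages.
With one new message per emitted message it is still holding ten
queued messages after a thousand emits, which is the trapped state
described in step 5.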

The system usually never recovers in time once this sort of condition
hits.  The following is the patch that I suggested.  It only punts
when messages are already being punted, and we can easily make it
less punty by delaying the punting by N messages.

 http://lkml.kernel.org/r/20171102135258.GO3252168@xxxxxxxxxxxxxxxxxxxxxxxxxxx
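
One way to read "only punts when messages are already being punted,
delayed by N messages" is sketched below as a toy Python model.  This
is an interpretation for illustration, not the linked patch: the
function name, the "punt_threshold" parameter (the N above) and the
deque standing in for the ring are hypothetical, and the handoff to an
independent context is modelled by simply returning.

```python
from collections import deque


def flush_with_offload(initial, spawn_per_emit, punt_threshold):
    """Toy model: flush inline, hand off after N late messages.

    Hypothetical sketch, not the actual patch.  Messages queued
    before flushing began are always emitted inline.  Messages that
    arrived during the flush are emitted inline only until
    `punt_threshold` of them have been handled; the rest are left
    for an independent context.
    Returns (messages emitted inline, messages handed off).
    """
    ring = deque(range(initial))
    boundary = initial      # ids >= boundary arrived mid-flush
    next_id = initial
    emitted_inline = 0
    late_emitted = 0
    while ring:
        if ring[0] >= boundary:
            if late_emitted >= punt_threshold:
                # Stop here; the caller is free to make progress.
                return emitted_inline, len(ring)
            late_emitted += 1
        ring.popleft()
        emitted_inline += 1
        for _ in range(spawn_per_emit):
            ring.append(next_id)
            next_id += 1
    return emitted_inline, 0


if __name__ == "__main__":
    print(flush_with_offload(10, 0, 5))   # quiet system, no handoff
    print(flush_with_offload(10, 1, 5))   # feedback, bounded inline work
```

In this model a quiet system never hands anything off, so the common
case keeps printing synchronously.  Under feedback the original
context emits its own ten messages plus N late ones and is then
released, instead of flushing forever.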

We definitely can fix the above described case by e.g. preventing
the printk flushing task from queueing more messages or whatever, but
it just seems really dumb for the system to die from things like this
in general and it doesn't really take all that much to trigger the
condition.

Thanks.

-- 
tejun
