Re: [PATCH] mm,oom: Re-enable OOM killer using timeout.

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Wed, 20 Apr 2016 06:55:42 +0900

Michal Hocko wrote:
> > This patch adds a timeout for handling corner cases where a TIF_MEMDIE
> > thread got stuck. Since the timeout is checked at oom_unkillable_task(),
> > oom_scan_process_thread() will not find TIF_MEMDIE thread
> > (for !oom_kill_allocating_task case) and oom_badness() will return 0
> > (for oom_kill_allocating_task case).
> > 
> > By applying this patch, the kernel will automatically press SysRq-f if
> > the OOM reaper cannot reap the victim's memory, and we will never OOM
> > livelock forever as long as the OOM killer is called.
> 
> Which will not guarantee anything as already pointed out several times
> before. So I think this is not really that useful. I have said it
> earlier and will repeat it again. Any timeout based solution which
> doesn't guarantee that the system will be in a consistent state (reboot,
> panic or kill all existing tasks) after the specified timeout is
> pointless.

Triggering the reboot/panic is the worst action. Killing all existing tasks
is the next worst action. Thus, I prefer killing tasks one by one.

I'm OK with shortening the timeout like N (when waiting for the 1st victim)
+ N/2 (the 2nd victim) + N/4 (the 3rd victim) + N/8 (the 4th victim) + ...
but is it worth complicating such an unlikely path?
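
To make the shape of that halving schedule concrete, here is a rough
userspace sketch, not code from the patch. The names victim_timeout() and
total_wait() are made up for illustration, and N is assumed to be in
milliseconds. With integer halving, the total wait stays below 2 * N no
matter how many victims are tried.

```c
#include <limits.h>

/*
 * Hypothetical helper: how long to wait for the k-th victim, where
 * k = 0 is the first victim. Implements N, N/2, N/4, N/8, ...
 */
static unsigned long victim_timeout(unsigned long n, unsigned int k)
{
	/* Shifting by the full width of the type is undefined in C. */
	if (k >= sizeof(n) * CHAR_BIT)
		return 0;
	return n >> k;
}

/* Hypothetical helper: total time waited across the first 'victims' victims. */
static unsigned long total_wait(unsigned long n, unsigned int victims)
{
	unsigned long sum = 0;
	unsigned int k;

	for (k = 0; k < victims; k++)
		sum += victim_timeout(n, k);
	return sum;
}
```

For example, with N = 8000 the first four victims get 8000, 4000, 2000 and
1000, so total_wait(8000, 4) is 15000, and the sum never reaches 16000.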

> 
> I believe that the chances of the lockup are much less likely with the
> oom reaper and that we are not really urged to provide a new knob with a
> random semantic. If we really want to have a timeout based thing better
> make it behave reliably.

How long an administrator can afford to wait varies. Some may want to
set a few seconds because of a 10-second /dev/watchdog timeout, while others
may want to set one minute because they are not using a watchdog. Thus, I
think we should not hard-code the timeout.
