On Wed 10-06-15 16:28:01, Michal Hocko wrote:
> On Wed 10-06-15 21:20:58, Tetsuo Handa wrote:
[...]
> > Since my version uses a per-"struct task_struct" variable (memdie_start),
> > the 5-second timeout is checked for each memory cgroup individually. It can
> > avoid unnecessary panic() calls if nobody needs to call out_of_memory()
> > again (probably because somebody volunteered memory) when the OOM victim
> > cannot be terminated for some reason. If we want to distinguish between
> > "the entire system is under OOM" and "some memory cgroup is under OOM",
> > because the former is urgent but the latter is less urgent, it can be
> > modified to allow different timeout periods for system-wide OOM and cgroup
> > OOM. Finally, it can give a hint about "in what sequence threads got stuck"
> > and "which thread took 5 seconds" when analyzing a vmcore.
>
> I will have a look at how you have implemented that, but separate timeouts
> sound like major over-engineering. Also note that global vs. memcg OOM
> is not sufficient because there are other OOM domains, as mentioned above.

Your patch is doing way too many things at once :/ So let me just focus
on the "panic if a task is stuck with TIF_MEMDIE for too long" part. It
looks like an alternative to the approach I've chosen. It doesn't consider
the allocation restriction, so a locked-up cpuset/NUMA node(s) might panic
the system, which doesn't sound like a good idea, but that is easily
fixable. Could you tear just this part out and repost it so that we can
compare the two approaches? panic_on_oom=2 would still be weird because
some nodes might stay in an OOM condition without triggering the panic,
but maybe that is acceptable.

Thanks!
--
Michal Hocko
SUSE Labs