Re: [PATCH 0/10 -v3] Handle oom bypass more gracefully

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Michal Hocko wrote:
> OK, so you are arming the timer for each mark_oom_victim regardless
> of the oom context. This means that you have replaced one potential
> lockup by other potential livelocks. Tasks from different oom domains
> might interfere here...
> 
> Also this code doesn't even seem easier. It is surely less lines of
> code but it is really hard to realize how would the timer behave for
> different oom contexts.

If you worry about interference, we can use per signal_struct timestamp.
I used per task_struct timestamp in my earlier versions (where per
task_struct TIF_MEMDIE check was used instead of per signal_struct
oom_victims).

> > What is wrong with above patch? How much difference is there compared to
> > calling schedule_timeout_killable(HZ) in oom_kill_process() before
> > releasing oom_lock and later checking MMF_OOM_REAPED after re-taking
> > oom_lock when we can't wake up the OOM reaper?
> 
> I fail to see how much this is different, really. Your patch is checking
> timer_pending with a global context in the same path and that is imho
> much harder to argue about than something which is task->mm based.

We can use per signal_struct or per task_struct timestamp if you don't
like global timestamp.

> > I'm OK with "a decision based by a feedback" but you don't like waking up
> > the OOM reaper ("invoking the oom reaper just to find out what we know
> > already and it is unlikely to change after oom_kill_process just doesn't
> > make much sense."). So what feedback mechanisms are possible other than
> > timeout like above patch?
> 
> Is this about the patch 10? Well, yes, there is a case where oom reaper
> cannot be invoked and we have no feedback. Then we have no other way
> than to wait for some time. I believe it is easier to wait in the oom
> context directly than to add a global timer. Both approaches would need
> some code in the oom victim selection code and it is much easier to
> argue about the victim specific context than a global one as mentioned
> above.

But expiring timeout by sleeping inside oom_kill_process() prevents other
threads which are OOM-killed from obtaining TIF_MEMDIE, for anybody needs
to wait for oom_lock in order to obtain TIF_MEMDIE. Unless you set
TIF_MEMDIE to all OOM-killed threads from oom_kill_process() or allow
the caller context to use ALLOC_NO_WATERMARKS by checking whether current
was already OOM-killed rather than TIF_MEMDIE, attempt to expiring timeout
by sleeping inside oom_kill_process() is useless.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]