Michal Hocko wrote: > On Sun 29-05-16 01:25:14, Tetsuo Handa wrote: > > There has been three problems about SysRq-f (manual invocation of the OOM > > killer) case. To make description simple, this patch assumes situation > > where the OOM reaper is not called (because the OOM victim's mm is shared > > by unkillable threads) or not available (due to kthread_run() failure or > > CONFIG_MMU=n). > > > > First is that moom_callback() is not called by moom_work under OOM > > livelock situation because it does not have a dedicated WQ like vmstat_wq. > > This problem is not fixed yet. > > Why do you mention it in the changelog when it is not related to the > patch then? Just we won't forget about it. > > Btw. you can (ab)use oom_reaper for that purpose. The patch would be > quite trivial. How do you handle CONFIG_MMU=n case? Are we going to provide oom_reaper for CONFIG_MMU=n case? > > > Second is that select_bad_process() chooses a thread group which already > > has a TIF_MEMDIE thread. Since commit f44666b04605d1c7 ("mm,oom: speed up > > select_bad_process() loop") changed oom_scan_process_group() to use > > task->signal->oom_victims, non SysRq-f case will no longer select a > > thread group which already has a TIF_MEMDIE thread. > > I am not sure the reference to the commit is really helpful. The > behavior you are describing below was there before this commit, the only > thing that has changed is the scope of the TIF_MEMDIE check. Indeed. Traversing on all threads can always find a thread group which already has a TIF_MEMDIE thread when there is already a TIF_MEMDIE thread. > > > But SysRq-f case will > > select such thread group due to returning OOM_SCAN_OK. This patch makes > > sure that oom_badness() is skipped by making oom_scan_process_group() to > > return OOM_SCAN_CONTINUE for SysRq-f case. > > I am OK with this part. I was suggesting something similar except I > wanted to skip over tasks which have fatal_signal_pending and that part > got nacked by David AFAIR. Could you make this a separate patch, please? I think it is better to change both part with this patch. > > > Third is that oom_kill_process() chooses a thread group which already > > has a TIF_MEMDIE thread when the candidate select_bad_process() chose > > has children because oom_badness() does not take TIF_MEMDIE into account. > > This patch checks child->signal->oom_victims before calling oom_badness() > > if oom_kill_process() was called by SysRq-f case. This resembles making > > sure that oom_badness() is skipped by returning OOM_SCAN_CONTINUE. > > This makes sense to me as well but why should be limit this to sysrq case? > Does it make any sense to select a child which already got killed for > normal OOM killer? Anyway I think it would be better to split this into > its own patch as well. The reason is described in next paragraph. Do we prefer immediately killing all children of the allocating task? If yes, I think it should be a separate patch on top of this patch because somebody might complain such behavior as a regression. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>