On Thu 19-05-16 23:29:38, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > Tetsuo has properly noted that mmput slow path might get blocked waiting
> > for another party (e.g. exit_aio waits for an IO). If that happens the
> > oom_reaper would be put out of the way and will not be able to process
> > next oom victim. We should strive for making this context as reliable
> > and independent on other subsystems as much as possible.
> >
> > Introduce mmput_async which will perform the slow path from an async
> > (WQ) context. This will delay the operation but that shouldn't be a
> > problem because the oom_reaper has reclaimed the victim's address space
> > for most cases as much as possible and the remaining context shouldn't
> > bind too much memory anymore. The only exception is when mmap_sem
> > trylock has failed which shouldn't happen too often.
> >
> > The issue is only theoretical but not impossible.
>
> Just a random thought, but after this patch is applied, do we still need to use
> a dedicated kernel thread for OOM-reap operation? If I recall correctly, the
> reason we decided to use a dedicated kernel thread was that calling
> down_read(&mm->mmap_sem) / mmput() from the OOM killer context is unsafe due to
> dependency. By replacing mmput() with mmput_async(), since __oom_reap_task() will
> no longer do operations that might block, can't we try OOM-reap operation from
> current thread which called mark_oom_victim() or oom_scan_process_thread() ?

I was already thinking about that. It is true that the main blocker
was the mmput, as you say, but the dedicated kernel thread seems to be
more robust locking-wise and stack-wise. So I would prefer staying with
the current approach until we see that it is somehow limiting. One pid
and one kernel stack don't seem to be a terrible price to me. But as
I've said, I am not bound to the kernel thread approach...

> I want to start waking up the OOM reaper whenever TIF_MEMDIE is set or found.
>
> Using a dedicated kernel thread is still better because memory allocation path
> already consumed a lot of kernel stack? But we don't need to give up OOM-reaping
> when kthread_run() failed.

Is kthread_run failure during early boot even an option? Isn't such a
system screwed up by definition?

-- 
Michal Hocko
SUSE Labs