Michal Hocko wrote:
> On Mon 28-09-15 15:24:06, David Rientjes wrote:
> > I agree that i_mutex seems to be one of the most common offenders.
> > However, I'm not sure I understand why holding it while trying to allocate
> > infinitely for an order-0 allocation is problematic wrt the proposed
> > kthread.
>
> I didn't say it would be problematic. We are talking past each other
> here. All I wanted to say was that a separate kernel oom thread wouldn't
> _help_ with the lock dependencies.
>

Oops. I misunderstood. I thought you were skeptical about the memory
unmapping approach because of lock dependencies. Rather, you are
skeptical about using a dedicated kernel thread for the memory
unmapping approach.

> > The kthread itself need only take mmap_sem for read. If all
> > threads sharing the mm with a victim have been SIGKILL'd, they should get
> > TIF_MEMDIE set when reclaim fails and be able to allocate so that they can
> > drop mmap_sem.
>
> which is the case if the direct oom context used trylock...
> So just to make it clear. I am not objecting a specialized oom kernel
> thread. It would work as well. I am just not convinced that it is really
> needed because the direct oom context can use trylock and do the same
> work directly.

Well, I think it depends on where we call the memory unmapping code
from.

The first candidate is oom_kill_process(), because that is where the mm
struct to unmap is determined. But since select_bad_process() aborts
upon encountering a TIF_MEMDIE task, we will never call the memory
unmapping code again if the first down_trylock(&mm->mmap_sem) attempt
in oom_kill_process() fails. (Here I assume that we allow all OOM
victims to access memory reserves, so that subsequent
down_trylock(&mm->mmap_sem) attempts could succeed.)

The second candidate is select_bad_process(), because that is where we
can call the memory unmapping code again upon encountering a TIF_MEMDIE
task.

The third candidate is the caller of out_of_memory(), because that is
where we can call the memory unmapping code again even when the OOM
victims are blocked. (Our discussion seems to assume that TIF_MEMDIE
tasks can make forward progress and die. But since TIF_MEMDIE tasks
might encounter unkillable locks after returning from allocation (e.g.
http://lkml.kernel.org/r/201509290118.BCJ43256.tSFFFMOLHVOJOQ@xxxxxxxxxxxxxxxxxxx ),
it is safer not to assume that out_of_memory() can always be called.)

So I thought that a dedicated kernel thread would make it easy to call
the memory unmapping code periodically, again and again.
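
To make the difference between a single trylock attempt and a periodic
retry concrete, here is a minimal userspace sketch. It is not kernel
code and not a proposed patch: struct mm_model, model_read_trylock(),
try_unmap(), tick(), reap_once() and reap_periodically() are all
hypothetical names invented for this illustration, and the "lock" is
just a counter of how many more ticks some other task is assumed to
hold mmap_sem for write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a victim's mm. Not a kernel structure. */
struct mm_model {
	int write_locked_ticks;	/* ticks for which "mmap_sem" stays held for write */
	size_t mapped_pages;	/* pages the victim still has mapped */
};

/* Models a non-blocking read trylock on mm->mmap_sem: it never waits. */
static bool model_read_trylock(const struct mm_model *mm)
{
	return mm->write_locked_ticks == 0;
}

/* One unmapping attempt. Returns pages freed, or 0 if the lock was busy. */
static size_t try_unmap(struct mm_model *mm)
{
	size_t freed;

	assert(mm != NULL);
	if (!model_read_trylock(mm))
		return 0;
	freed = mm->mapped_pages;
	mm->mapped_pages = 0;
	return freed;
}

/* Advance model time: the assumed writer gets closer to dropping the lock. */
static void tick(struct mm_model *mm)
{
	if (mm->write_locked_ticks > 0)
		mm->write_locked_ticks--;
}

/* One-shot attempt, as if the unmapping code ran only once per victim. */
static size_t reap_once(struct mm_model *mm)
{
	return try_unmap(mm);
}

/* Periodic attempts, as if a dedicated thread retried after each interval. */
static size_t reap_periodically(struct mm_model *mm, int max_attempts)
{
	for (int i = 0; i < max_attempts; i++) {
		size_t freed = try_unmap(mm);

		if (freed > 0)
			return freed;
		tick(mm);	/* wait one interval before retrying */
	}
	return 0;
}
```

In this model, a one-shot attempt made while the lock is still held
frees nothing and is never repeated, whereas the periodic variant
succeeds as soon as the lock is dropped, provided it is given enough
attempts. The model deliberately ignores who drops the lock and why;
that is exactly the part the discussion above is about.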