On Thu 17-12-15 13:13:56, Andrew Morton wrote:
[...]
> Also, re-reading your description:
>
> : It has been shown (e.g. by Tetsuo Handa) that it is not that hard to
> : construct workloads which break the core assumption mentioned above and
> : the OOM victim might take an unbounded amount of time to exit because it
> : might be blocked in the uninterruptible state waiting on an event
> : (e.g. lock) which is blocked by another task looping in the page
> : allocator.
>
> So the allocating task has done an oom-kill and is waiting for memory
> to become available.  The killed task is stuck on some lock, unable to
> free memory.
>
> But the problematic lock will sometimes be the killed task's mmap_sem,
> so the reaper won't reap anything.  This scenario requires that the
> mmap_sem is held for writing, which sounds like it will be uncommon.

Yes, I have mentioned that in the changelog:
"
oom_reaper has to take mmap_sem on the target task for reading so the
solution is not 100% because the semaphore might be held or blocked for
write but the probability is reduced considerably wrt. basically any
lock blocking forward progress as described above.
"

Another thing to do is to change down_write(mmap_sem) to
down_write_killable in most cases where we have a clear EINTR semantic.
This is on my todo list.

> hm.  sigh.  I hate the oom-killer.  Just buy some more memory already!

Tell me something about that...
-- 
Michal Hocko
SUSE Labs
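
To make the mmap_sem caveat concrete, here is a minimal single-threaded C sketch of a reaper that takes the semaphore only for reading: it makes progress when the semaphore is free or read-held, and gives up when a writer holds it. This is a toy model, not kernel code. The names toy_rwsem, toy_mm, try_reap, demo_reap and the toy_* helpers are hypothetical and invented for illustration, the counter is not thread-safe, and the use of a trylock (rather than a blocking read lock) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy reader/writer semaphore (hypothetical, NOT thread-safe).
 * count > 0: that many readers, 0: free, -1: one writer. */
struct toy_rwsem {
	int count;
};

static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->count < 0)
		return false;	/* a writer holds it: readers are excluded */
	sem->count++;
	return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	sem->count--;
}

static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->count != 0)
		return false;	/* any reader or writer excludes a new writer */
	sem->count = -1;
	return true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	sem->count = 0;
}

/* Hypothetical stand-in for the OOM victim's address space. */
struct toy_mm {
	struct toy_rwsem mmap_sem;
	size_t reclaimable_pages;
};

/* Returns the number of pages "freed", or 0 if the read lock is unavailable. */
static size_t try_reap(struct toy_mm *mm)
{
	size_t freed;

	if (!toy_down_read_trylock(&mm->mmap_sem))
		return 0;	/* held for write: the reaper cannot help here */
	freed = mm->reclaimable_pages;
	mm->reclaimable_pages = 0;
	toy_up_read(&mm->mmap_sem);
	return freed;
}

enum holder { HELD_BY_NOBODY, HELD_FOR_READ, HELD_FOR_WRITE };

/* Runs one reap attempt while 'who' holds the victim's mmap_sem. */
size_t demo_reap(enum holder who)
{
	struct toy_mm mm = { .mmap_sem = { .count = 0 }, .reclaimable_pages = 128 };
	size_t freed;

	if (who == HELD_FOR_READ)
		(void)toy_down_read_trylock(&mm.mmap_sem);
	else if (who == HELD_FOR_WRITE)
		(void)toy_down_write_trylock(&mm.mmap_sem);

	freed = try_reap(&mm);

	if (who == HELD_FOR_READ)
		toy_up_read(&mm.mmap_sem);
	else if (who == HELD_FOR_WRITE)
		toy_up_write(&mm.mmap_sem);

	assert(mm.mmap_sem.count == 0);	/* every acquire was released */
	return freed;
}
```

In this model only demo_reap(HELD_FOR_WRITE) fails to free anything, which mirrors the changelog's point that the approach is "not 100%" but fails only in the write-held case rather than on any lock that blocks forward progress.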