On Wed 13-06-18 22:20:49, Tetsuo Handa wrote:
> On 2018/06/05 17:57, Michal Hocko wrote:
> >> For this reason, we see testing harnesses often oom killed immediately
> >> after running a unittest that stresses reclaim or compaction by inducing a
> >> system-wide oom condition. The harness spawns the unittest which spawns
> >> an antagonist memory hog that is intended to be oom killed. When memory
> >> is mlocked or there are a large number of threads faulting memory for the
> >> antagonist, the unittest and the harness itself get oom killed because the
> >> oom reaper sets MMF_OOM_SKIP; this ends up happening a lot on powerpc.
> >> The memory hog has mm->mmap_sem readers queued ahead of a writer that is
> >> doing mmap() so the oom reaper can't grab the sem quickly enough.
> >
> > How come the writer doesn't back off. mmap paths should be taking an
> > exclusive mmap sem in killable sleep so it should back off. Or is the
> > holder of the lock deep inside mmap path doing something else and not
> > backing out with the exclusive lock held?
> >
>
> Here is an example where the writer doesn't back off.
>
> http://lkml.kernel.org/r/20180607150546.1c7db21f70221008e14b8bb8@xxxxxxxxxxxxxxxxxxxx
>
> down_write_killable(&mm->mmap_sem) is nothing but increasing the possibility of
> successfully back off. There is no guarantee that the owner of that exclusive
> mmap sem will not be blocked by other unkillable waits.

But we are talking about the mmap() path here. Sure, there are other paths
which might need to back off while the lock is held, and that should be
addressed if possible, but this is not really related to what David wrote
above and what I tried to understand.
-- 
Michal Hocko
SUSE Labs