On 2018/06/05 17:57, Michal Hocko wrote:
>> For this reason, we see testing harnesses often oom killed immediately
>> after running a unittest that stresses reclaim or compaction by inducing a
>> system-wide oom condition. The harness spawns the unittest which spawns
>> an antagonist memory hog that is intended to be oom killed. When memory
>> is mlocked or there are a large number of threads faulting memory for the
>> antagonist, the unittest and the harness itself get oom killed because the
>> oom reaper sets MMF_OOM_SKIP; this ends up happening a lot on powerpc.
>> The memory hog has mm->mmap_sem readers queued ahead of a writer that is
>> doing mmap() so the oom reaper can't grab the sem quickly enough.
>
> How come the writer doesn't back off. mmap paths should be taking an
> exclusive mmap sem in killable sleep so it should back off. Or is the
> holder of the lock deep inside mmap path doing something else and not
> backing out with the exclusive lock held?
>

Here is an example where the writer doesn't back off.

  http://lkml.kernel.org/r/20180607150546.1c7db21f70221008e14b8bb8@xxxxxxxxxxxxxxxxxxxx

down_write_killable(&mm->mmap_sem) does nothing more than increase the chance
of backing off successfully. A killable sleep only helps a task that is still
waiting for the lock. There is no guarantee that the owner of that exclusive
mmap_sem will not be blocked by other unkillable waits.
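
To make the distinction concrete, here is a rough userspace model of the point above. It is only a sketch, not kernel code: every model_* name, the single-threaded state machine, and the retry limit are assumptions made up for illustration. It shows that a killable acquire lets a waiter give up, but does nothing once the task already owns the lock and is stuck in some other unkillable wait.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of the model_* names are kernel APIs. */
enum model_sem_state { MODEL_SEM_FREE, MODEL_SEM_WRITE_HELD };

struct model_mm {
    enum model_sem_state mmap_sem;
    bool oom_skip;              /* stands in for MMF_OOM_SKIP */
};

struct model_task {
    bool sigkill_pending;
};

#define MODEL_MAX_REAP_RETRIES 10   /* assumed retry limit for the model */

/*
 * Models a killable write acquire: a task that finds the lock busy gives up
 * with -EINTR if it has been killed, otherwise it would keep waiting
 * (reported here as -EAGAIN). Killability only matters on this wait path.
 */
static int model_down_write_killable(struct model_mm *mm,
                                     const struct model_task *task)
{
    if (mm->mmap_sem == MODEL_SEM_FREE) {
        mm->mmap_sem = MODEL_SEM_WRITE_HELD;
        return 0;
    }
    return task->sigkill_pending ? -EINTR : -EAGAIN;
}

/*
 * Models the reaper: try the lock a bounded number of times, then mark the
 * mm as skippable whether or not anything was reaped. The model is
 * single-threaded, so the lock state cannot change between attempts.
 * Returns true only if the lock could be taken (memory reaped).
 */
static bool model_oom_reap(struct model_mm *mm)
{
    bool reaped = false;

    for (int i = 0; i < MODEL_MAX_REAP_RETRIES && !reaped; i++)
        reaped = (mm->mmap_sem == MODEL_SEM_FREE);

    mm->oom_skip = true;
    return reaped;
}

/*
 * The writer takes the lock, is then chosen as the OOM victim, and either
 * backs out promptly or sits in an unkillable wait with the lock held.
 * Returns true if the victim's memory was reaped.
 */
bool model_scenario(bool holder_in_unkillable_wait)
{
    struct model_mm mm = { MODEL_SEM_FREE, false };
    struct model_task writer = { false };

    int ret = model_down_write_killable(&mm, &writer);
    assert(ret == 0);               /* uncontended, so the writer owns it */
    (void)ret;

    writer.sigkill_pending = true;  /* OOM killer selects this process */

    if (!holder_in_unkillable_wait)
        mm.mmap_sem = MODEL_SEM_FREE;   /* holder sees the kill, unlocks */
    /* else: SIGKILL has no effect on the holder, lock stays held */

    return model_oom_reap(&mm);
}
```

In this model, model_scenario(false) lets the reaper succeed, while model_scenario(true) ends with oom_skip set and nothing reaped, which is the case where the next victim gets selected even though the first one never released memory.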