On Fri 07-09-18 20:36:31, Tetsuo Handa wrote:
> On 2018/09/07 20:10, Michal Hocko wrote:
> >> I can't waste my time in what you think the long term solution. Please
> >> don't refuse/ignore my (or David's) patches without your counter
> >> patches.
> >
> > If you do not care about long term sanity of the code and if you do not
> > care about a larger picture then I am not interested in any patches from
> > you. MM code is far from trivial and no playground. This attitude of
> > yours is just dangerous.
> >
>
> Then, please explain how we guarantee that enough CPU resource is spent
> between "exit_mmap() set MMF_OOM_SKIP" and "the OOM killer finds MMF_OOM_SKIP
> was already set" so that last second allocation with high watermark can't fail
> when 50% of available memory was already reclaimed.

There is no guarantee. Full stop! This is an inherently racy land. We can
strive to work reasonably well but this will never be perfect. And no,
no timeout is going to solve it either. We have to live with the fact
that sometimes we hit the race and kill an additional task. As long as
there are no reasonable workloads which hit this race then we are good
enough.

The only guarantee we can talk about is the forward progress guarantee.
If we know that exit_mmap is past the blocking point then we can hand
over MMF_OOM_SKIP setting to the exit path rather than oom_reaper. Last
moment (minute, millisecond, nanosecond for that matter) allocation is in
no way related or solvable without strong locking, and we have learned
in the past that this is not a good idea. This is nothing new though.

This discussion is not moving forward. It just burns time so this is my
last email in this thread.
-- 
Michal Hocko
SUSE Labs