On Wed 21-02-18 23:27:05, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Tue 20-02-18 22:32:56, Tetsuo Handa wrote:
> > > >From c3b6616238fcd65d5a0fdabcb4577c7e6f40d35e Mon Sep 17 00:00:00 2001
> > > From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> > > Date: Tue, 20 Feb 2018 11:07:23 +0900
> > > Subject: [PATCH] mm,page_alloc: wait for oom_lock than back off
> > >
> > > This patch fixes a bug which is essentially same with a bug fixed by
> > > commit 400e22499dd92613 ("mm: don't warn about allocations which stall for
> > > too long").
> > >
> > > Currently __alloc_pages_may_oom() is using mutex_trylock(&oom_lock) based
> > > on an assumption that the owner of oom_lock is making progress for us. But
> > > it is possible to trigger OOM lockup when many threads concurrently called
> > > __alloc_pages_slowpath() because all CPU resources are wasted for pointless
> > > direct reclaim efforts. That is, schedule_timeout_uninterruptible(1) in
> > > __alloc_pages_may_oom() does not always give enough CPU resource to the
> > > owner of the oom_lock.
> > >
> > > It is possible that the owner of oom_lock is preempted by other threads.
> > > Preemption makes the OOM situation much worse. But the page allocator is
> > > not responsible about wasting CPU resource for something other than memory
> > > allocation request. Wasting CPU resource for memory allocation request
> > > without allowing the owner of oom_lock to make forward progress is a page
> > > allocator's bug.
> > >
> > > Therefore, this patch changes to wait for oom_lock in order to guarantee
> > > that no thread waiting for the owner of oom_lock to make forward progress
> > > will not consume CPU resources for pointless direct reclaim efforts.
> >
> > So instead we will have many tasks sleeping on the lock and prevent the
> > oom reaper to make any forward progress. This is not a solution without
> > further steps. Also I would like to see a real life workload that would
> > benefit from this.
>
> Of course I will propose follow-up patches.

The patch in its current form will cause a worse behavior than we have
currently, because pending oom waiters simply block the oom reaper. So I
do not really see any reason to push this forward without other changes.

So NAK to this patch in its current form.
-- 
Michal Hocko
SUSE Labs
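
For readers following the thread, the two locking styles under discussion can be sketched in user space. This is only an analogy, not the kernel code: demo_lock is a hypothetical stand-in for oom_lock, the two function names are invented for illustration, and POSIX pthread_mutex_trylock()/pthread_mutex_lock() stand in for the kernel's mutex_trylock()/mutex_lock(). The first function backs off when the lock is busy (the current behaviour the patch describes); the second sleeps until the lock is free (roughly what the patch proposes, and what the reply above objects to when many waiters pile up).

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for oom_lock; a user-space analogy only. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Trylock style: if the lock is busy, return at once so the caller can retry. */
static bool demo_try_then_back_off(void)
{
	if (pthread_mutex_trylock(&demo_lock) != 0)
		return false;	/* contended: caller keeps running and may burn CPU */
	/* ... the lock holder's work would go here ... */
	pthread_mutex_unlock(&demo_lock);
	return true;
}

/* Blocking style: sleep until the lock is free, however long that takes. */
static bool demo_wait_for_lock(void)
{
	if (pthread_mutex_lock(&demo_lock) != 0)
		return false;	/* only on a locking error, not on contention */
	/* ... the lock holder's work would go here ... */
	pthread_mutex_unlock(&demo_lock);
	return true;
}
```

In the trylock variant a contended caller never sleeps on the lock, so it cannot queue up behind the holder, but it may spend CPU retrying. In the blocking variant a contended caller consumes no CPU while waiting, but every waiter is now parked on the lock, which is the behaviour the NAK above is concerned about.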