On Tue, Dec 01, 2015 at 01:56:46PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> wait_iff_congested has been used to throttle allocator before it retried
> another round of direct reclaim to allow the writeback to make some
> progress and prevent reclaim from looping over dirty/writeback pages
> without making any progress. We used to do congestion_wait before
> 0e093d99763e ("writeback: do not sleep on the congestion queue if
> there are no congested BDIs or if significant congestion is not being
> encountered in the current zone") but that led to undesirable stalls
> and sleeping for the full timeout even when the BDI wasn't congested.
> Hence wait_iff_congested was used instead. But it seems that even
> wait_iff_congested doesn't work as expected. We might have a small file
> LRU list with all pages dirty/writeback and yet the bdi is not congested
> so this is just a cond_resched in the end and can end up triggering
> premature OOM.
>
> This patch replaces the unconditional wait_iff_congested by
> congestion_wait which is executed only if we _know_ that the last round
> of direct reclaim didn't make any progress and dirty+writeback pages are
> more than a half of the reclaimable pages on the zone which might be
> usable for our target allocation. This shouldn't reintroduce stalls
> fixed by 0e093d99763e because congestion_wait is called only when we
> are getting hopeless when sleeping is a better choice than OOM with many
> pages under IO.
>
> We have to preserve logic introduced by "mm, vmstat: allow WQ concurrency
> to discover memory reclaim doesn't make any progress" into the
> __alloc_pages_slowpath now that wait_iff_congested is not used anymore.
> As the only remaining user of wait_iff_congested is shrink_inactive_list
> we can remove the WQ specific short sleep from wait_iff_congested
> because the sleep is needed to be done only once in the allocation retry
> cycle.
>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>

Yep, this looks like the right thing to do. However, the code it adds to
__alloc_pages_slowpath() is putting even more weight behind the argument
that the reclaim retry logic should be in its own function.
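
For readers following along, the throttling condition from the changelog
can be modelled in a few lines of userspace C. This is only a sketch and
not the code from the patch: the names pick_throttle, throttle_action,
dirty_writeback and reclaimable are hypothetical, and the real check
lives in __alloc_pages_slowpath() and operates on per-zone counters. It
also leaves out the short WQ-specific sleep the changelog mentions.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the decision described in the
 * changelog. Names and signature are illustrative assumptions and do
 * not match any kernel API.
 */
enum throttle_action {
	THROTTLE_NONE,			/* do not wait on congestion */
	THROTTLE_CONGESTION_WAIT	/* sleep so writeback can progress */
};

static enum throttle_action
pick_throttle(bool did_some_progress,
	      unsigned long dirty_writeback,
	      unsigned long reclaimable)
{
	/* The last round of direct reclaim made progress: no need to sleep. */
	if (did_some_progress)
		return THROTTLE_NONE;

	/*
	 * No progress and more than half of the reclaimable pages are
	 * dirty or under writeback: sleeping beats a premature OOM.
	 * Comparing against reclaimable / 2 avoids overflowing
	 * 2 * dirty_writeback and is equivalent for integer inputs.
	 */
	if (dirty_writeback > reclaimable / 2)
		return THROTTLE_CONGESTION_WAIT;

	return THROTTLE_NONE;
}
```

Pulling the decision into one small helper like this is also roughly the
shape a separate reclaim retry function could take, although the actual
split is left to the patch author.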