On Tue, Dec 16, 2014 at 05:07:16PM -0800, David Rientjes wrote: > On Mon, 15 Dec 2014, Johannes Weiner wrote: > > /* Check if we should retry the allocation */ > > pages_reclaimed += did_some_progress; > > if (should_alloc_retry(gfp_mask, order, did_some_progress, > > pages_reclaimed)) { > > + /* > > + * If we fail to make progress by freeing individual > > + * pages, but the allocation wants us to keep going, > > + * start OOM killing tasks. > > + */ > > + if (!did_some_progress) { > > + page = __alloc_pages_may_oom(gfp_mask, order, zonelist, > > + high_zoneidx, nodemask, > > + preferred_zone, classzone_idx, > > + migratetype,&did_some_progress); > > Missing a space. That was because of the 80 character limit, it seemed preferrable over a linewrap. > > + if (page) > > + goto got_pg; > > + if (!did_some_progress) > > + goto nopage; > > + } > > /* Wait for some write requests to complete then retry */ > > wait_iff_congested(preferred_zone, BLK_RW_ASYNC, HZ/50); > > goto rebalance; > > This is broken because it does not recall gfp_to_alloc_flags(). If > current is the oom kill victim, then ALLOC_NO_WATERMARKS never gets set > properly and the slowpath will end up looping forever. The "restart" > label which was removed in this patch needs to be reintroduced, and it can > probably be moved to directly before gfp_to_alloc_flags(). Thanks for catching this. gfp_to_alloc_flags()'s name doesn't exactly imply such side effects... Here is a fixlet on top: --- >From 45362d1920340716ef58bf1024d9674b5dfa809e Mon Sep 17 00:00:00 2001 From: Johannes Weiner <hannes@xxxxxxxxxxx> Date: Tue, 16 Dec 2014 21:04:24 -0500 Subject: [patch] mm: page_alloc: embed OOM killing naturally into allocation slowpath fix When retrying the allocation after potentially invoking OOM, make sure the alloc flags are recalculated, as they have to consider TIF_MEMDIE. Restore the original restart label, but rename it to 'retry' to match the should_alloc_retry() it depends on. Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> --- mm/page_alloc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 83ec725aec36..e8f5997c557c 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2673,6 +2673,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order, (gfp_mask & GFP_THISNODE) == GFP_THISNODE) goto nopage; +retry: if (!(gfp_mask & __GFP_NO_KSWAPD)) wake_all_kswapds(order, zonelist, high_zoneidx, preferred_zone, nodemask); @@ -2695,7 +2696,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order, classzone_idx = zonelist_zone_idx(preferred_zoneref); } -rebalance: /* This is the last chance, in general, before the goto nopage. */ page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist, high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS, @@ -2823,7 +2823,7 @@ rebalance: } /* Wait for some write requests to complete then retry */ wait_iff_congested(preferred_zone, BLK_RW_ASYNC, HZ/50); - goto rebalance; + goto retry; } else { /* * High-order allocations do not necessarily loop after -- 2.1.3 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>