Hi Andrew, Mel,

On Thu, Jun 2, 2016 at 8:43 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, 2 Jun 2016 13:19:36 +0100 Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>> > >Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> >
>> > Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
>> >
>>
>> Thanks.
>
> I queued this.  A tested-by:Geert would be nice?
>
>
> From: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Subject: mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies
>
> The optimistic fast path may use cpuset_current_mems_allowed instead of
> a NULL nodemask supplied by the caller for cpuset allocations.  The
> preferred zone is calculated on this basis for statistic purposes and as a
> starting point in the zonelist iterator.
>
> However, if the context can ignore memory policies due to being atomic or
> being able to ignore watermarks then the starting point in the zonelist
> iterator is no longer correct.  This patch resets the zonelist iterator in
> the allocator slowpath if the context can ignore memory policies.  This
> will alter the zone used for statistics but only after it is known that it
> makes sense for that context.  Resetting it before entering the slowpath
> would potentially allow an ALLOC_CPUSET allocation to be accounted for
> against the wrong zone.  Note that while nodemask is not explicitly set to
> the original nodemask, it would only have been overwritten if
> cpuset_enabled() and it was reset before the slowpath was entered.
>
> Link: http://lkml.kernel.org/r/20160602103936.GU2527@xxxxxxxxxxxxxxxxxxx
> Fixes: c33d6c06f60f710 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")

My understanding was that this was an additional patch, not fixing the
problem per se?

Indeed, after applying this patch (without the other one that added
"z = ac->preferred_zoneref;" to the reset_fair block of
get_page_from_freelist()) I still get crashes...

Now testing with both applied...

> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Reported-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
>
>  mm/page_alloc.c |   23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff -puN mm/page_alloc.c~mm-page_alloc-recalculate-the-preferred-zoneref-if-the-context-can-ignore-memory-policies mm/page_alloc.c
> --- a/mm/page_alloc.c~mm-page_alloc-recalculate-the-preferred-zoneref-if-the-context-can-ignore-memory-policies
> +++ a/mm/page_alloc.c
> @@ -3604,6 +3604,17 @@ retry:
>  	 */
>  	alloc_flags = gfp_to_alloc_flags(gfp_mask);
>
> +	/*
> +	 * Reset the zonelist iterators if memory policies can be ignored.
> +	 * These allocations are high priority and system rather than user
> +	 * orientated.
> +	 */
> +	if ((alloc_flags & ALLOC_NO_WATERMARKS) || !(alloc_flags & ALLOC_CPUSET)) {
> +		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
> +		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
> +					ac->high_zoneidx, ac->nodemask);
> +	}
> +
>  	/* This is the last chance, in general, before the goto nopage. */
>  	page = get_page_from_freelist(gfp_mask, order,
>  				alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
> @@ -3612,12 +3623,6 @@ retry:
>
>  	/* Allocate without watermarks if the context allows */
>  	if (alloc_flags & ALLOC_NO_WATERMARKS) {
> -		/*
> -		 * Ignore mempolicies if ALLOC_NO_WATERMARKS on the grounds
> -		 * the allocation is high priority and these type of
> -		 * allocations are system rather than user orientated
> -		 */
> -		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
>  		page = get_page_from_freelist(gfp_mask, order,
>  						ALLOC_NO_WATERMARKS, ac);
>  		if (page)
> @@ -3816,7 +3821,11 @@ retry_cpuset:
>  	/* Dirty zone balancing only done in the fast path */
>  	ac.spread_dirty_pages = (gfp_mask & __GFP_WRITE);
>
> -	/* The preferred zone is used for statistics later */
> +	/*
> +	 * The preferred zone is used for statistics but crucially it is
> +	 * also used as the starting point for the zonelist iterator. It
> +	 * may get reset for allocations that ignore memory policies.
> +	 */
>  	ac.preferred_zoneref = first_zones_zonelist(ac.zonelist,
>  					ac.high_zoneidx, ac.nodemask);
>  	if (!ac.preferred_zoneref) {
> _
> --

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds