On Wed, 17 Dec 2014 16:51:21 -0800 (PST) David Rientjes <rientjes@xxxxxxxxxx> wrote:

> > > The page allocator slowpath is always called from the fastpath if the
> > > first allocation didn't succeed, so we don't know which of the two we
> > > allocated the page from at this tracepoint.
> >
> > True, but the idea is that when we call trace_mm_page_alloc(), local
> > var `mask' holds the gfp_t which was used in the most recent allocation
> > attempt.
> >
>
> So if the fastpath succeeds, which should be the majority of the time,
> then we get a tracepoint here that says we allocated with
> __GFP_FS | __GFP_IO even though we may have PF_MEMALLOC_NOIO set.  So if
> page != NULL, all we know is that either the fastpath succeeded or we
> don't have PF_MEMALLOC_NOIO set and were allowed to reclaim.  Not sure
> that's very helpful.
>
> The easiest thing to do would be to just clear __GFP_FS and __GFP_IO when
> we clear everything not in gfp_allowed_mask, but that's pointless if the
> fastpath succeeds.  I'm not sure it's worth restructuring the code, with
> a possible performance overhead, for the benefit of a tracepoint.
>
> And then there's the call to lockdep_trace_alloc() which does care about
> __GFP_FS.  That looks broken because we need to clear __GFP_FS with
> PF_MEMALLOC_NOIO.

<head spinning>

I'm not particularly concerned about the tracepoint and we can change
that later.  The main intent here is to restore the allocation mask when
__alloc_pages_nodemask() does the "goto retry_cpuset".

(I renamed `mask' to `alloc_mask' and documented it a bit)
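For reference, the helper everything here hinges on is roughly the
following -- a sketch of the include/linux/sched.h implementation as I
recall it in current trees, so check your own tree before relying on the
details:

	/* Strip the I/O-capable bits when the task must not do I/O */
	static inline gfp_t memalloc_noio_flags(gfp_t flags)
	{
		if (unlikely(current->flags & PF_MEMALLOC_NOIO))
			flags &= ~(__GFP_IO | __GFP_FS);
		return flags;
	}

Since it clears both __GFP_IO and __GFP_FS for PF_MEMALLOC_NOIO tasks,
its result has to be recomputed from the caller's original gfp_mask on
each pass rather than written back into it.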
From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Subject: mm/page_alloc.c:__alloc_pages_nodemask(): don't alter arg gfp_mask

__alloc_pages_nodemask() strips __GFP_IO when retrying the page
allocation.  But it does this by altering the function-wide variable
gfp_mask.  This will cause subsequent allocation attempts to
inadvertently use the modified gfp_mask.

Also, pass the correct mask (the mask we actually used) into
trace_mm_page_alloc().

Cc: Ming Lei <ming.lei@xxxxxxxxxxxxx>
Cc: Mel Gorman <mel@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/page_alloc.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff -puN mm/page_alloc.c~mm-page_allocc-__alloc_pages_nodemask-dont-alter-arg-gfp_mask mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_allocc-__alloc_pages_nodemask-dont-alter-arg-gfp_mask
+++ a/mm/page_alloc.c
@@ -2865,6 +2865,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, u
 	unsigned int cpuset_mems_cookie;
 	int alloc_flags = ALLOC_WMARK_LOW|ALLOC_CPUSET|ALLOC_FAIR;
 	int classzone_idx;
+	gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
 
 	gfp_mask &= gfp_allowed_mask;
 
@@ -2898,22 +2899,24 @@ retry_cpuset:
 	classzone_idx = zonelist_zone_idx(preferred_zoneref);
 
 	/* First allocation attempt */
-	page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
-			zonelist, high_zoneidx, alloc_flags,
-			preferred_zone, classzone_idx, migratetype);
+	alloc_mask = gfp_mask|__GFP_HARDWALL;
+	page = get_page_from_freelist(alloc_mask, nodemask, order, zonelist,
+			high_zoneidx, alloc_flags, preferred_zone,
+			classzone_idx, migratetype);
 	if (unlikely(!page)) {
 		/*
 		 * Runtime PM, block IO and its error handling path
 		 * can deadlock because I/O on the device might not
 		 * complete.
 		 */
-		gfp_mask = memalloc_noio_flags(gfp_mask);
-		page = __alloc_pages_slowpath(gfp_mask, order,
+		alloc_mask = memalloc_noio_flags(gfp_mask);
+
+		page = __alloc_pages_slowpath(alloc_mask, order,
 				zonelist, high_zoneidx, nodemask,
 				preferred_zone, classzone_idx, migratetype);
 	}
 
-	trace_mm_page_alloc(page, order, gfp_mask, migratetype);
+	trace_mm_page_alloc(page, order, alloc_mask, migratetype);
 
 out:
 	/*
_
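To spell out the retry_cpuset concern, here is a stripped-down sketch of
the control flow after this patch.  Illustration only, not the real
function: fastpath() and slowpath() are invented stand-ins for
get_page_from_freelist() and __alloc_pages_slowpath().

	struct page *sketch(gfp_t gfp_mask, unsigned int order, int migratetype)
	{
		unsigned int cpuset_mems_cookie;
		gfp_t alloc_mask;
		struct page *page;

	retry_cpuset:
		cpuset_mems_cookie = read_mems_allowed_begin();

		/* First attempt: the caller's mask plus __GFP_HARDWALL */
		alloc_mask = gfp_mask | __GFP_HARDWALL;
		page = fastpath(alloc_mask);
		if (unlikely(!page)) {
			/*
			 * Derive the slowpath mask from the *unmodified*
			 * gfp_mask.  The old code wrote the result back into
			 * gfp_mask itself, so after "goto retry_cpuset" every
			 * subsequent attempt ran with __GFP_IO/__GFP_FS
			 * already stripped.
			 */
			alloc_mask = memalloc_noio_flags(gfp_mask);
			page = slowpath(alloc_mask);
		}

		/* Report the mask we actually allocated with */
		trace_mm_page_alloc(page, order, alloc_mask, migratetype);

		if (unlikely(!page && read_mems_allowed_retry(cpuset_mems_cookie)))
			goto retry_cpuset;	/* gfp_mask is still pristine */

		return page;
	}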