On Wed 25-11-15 12:57:08, David Rientjes wrote:
> On Wed, 25 Nov 2015, Michal Hocko wrote:
>
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 8034909faad2..94b04c1e894a 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2766,8 +2766,13 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> >  			goto out;
> >  	}
> >  	/* Exhausted what can be done so it's blamo time */
> > -	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
> > +	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
> >  		*did_some_progress = 1;
> > +
> > +		if (gfp_mask & __GFP_NOFAIL)
> > +			page = get_page_from_freelist(gfp_mask, order,
> > +					ALLOC_NO_WATERMARKS, ac);
> > +	}
> >  out:
> >  	mutex_unlock(&oom_lock);
> >  	return page;
>
> Well, sure, that's one way to do it, but for cpuset users, wouldn't this
> lead to a depletion of the first system zone since you've dropped
> ALLOC_CPUSET and are doing ALLOC_NO_WATERMARKS in the same call?

Are you suggesting something like the following?

	if (gfp_mask & __GFP_NOFAIL) {
		page = get_page_from_freelist(gfp_mask, order,
				ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);
		/*
		 * fallback to ignore cpuset if our nodes are
		 * depleted
		 */
		if (!page)
			page = get_page_from_freelist(gfp_mask, order,
					ALLOC_NO_WATERMARKS, ac);
	}

I am not really sure this is worth the complication. __GFP_NOFAIL should
be relatively rare, and nodes are rarely depleted so much that
ALLOC_NO_WATERMARKS wouldn't be able to allocate from the first zone in
the zone list.

I mean, I have no problem doing the above; it just sounds like
overcomplicating the situation without making a practical difference. If
you and others insist, I can respin the patch though.

-- 
Michal Hocko
SUSE Labs
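
For readers who want to see the two-step fallback discussed above in isolation, here is a minimal userspace sketch in C. It is a toy model only, not kernel code: the `toy_*` names, the `TOY_*` flags, and the zone/context structs are all hypothetical stand-ins invented for illustration, and the real get_page_from_freelist() does considerably more. The sketch first tries to allocate below the watermark while still honouring the cpuset mask, and only ignores the cpuset if that attempt fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags for this toy model; not the kernel's ALLOC_* values. */
#define TOY_NO_WATERMARKS 0x1u
#define TOY_CPUSET        0x2u

struct toy_zone {
	int node;         /* NUMA node that owns this zone */
	long free_pages;  /* pages currently free */
	long min_wmark;   /* reserve normally left untouched */
};

struct toy_ctx {
	struct toy_zone *zones;       /* zonelist in preference order */
	size_t nr_zones;
	unsigned long allowed_nodes;  /* bitmask of nodes in the caller's cpuset */
};

/*
 * Walk the zonelist and take one page from the first eligible zone.
 * Returns the index of the zone used, or -1 if nothing could be allocated.
 */
static int toy_get_page(struct toy_ctx *ctx, unsigned int flags)
{
	assert(ctx != NULL);

	for (size_t i = 0; i < ctx->nr_zones; i++) {
		struct toy_zone *z = &ctx->zones[i];
		long reserve;

		/* With TOY_CPUSET set, skip zones on nodes outside the cpuset. */
		if ((flags & TOY_CPUSET) &&
		    !(ctx->allowed_nodes & (1UL << z->node)))
			continue;

		/* With TOY_NO_WATERMARKS set, the reserve may be consumed. */
		reserve = (flags & TOY_NO_WATERMARKS) ? 0 : z->min_wmark;
		if (z->free_pages > reserve) {
			z->free_pages--;
			return (int)i;
		}
	}
	return -1;
}

/*
 * The two-step variant: dip into reserves inside the cpuset first, and
 * ignore the cpuset only when the allowed nodes are fully depleted.
 */
static int toy_nofail_alloc(struct toy_ctx *ctx)
{
	int idx = toy_get_page(ctx, TOY_NO_WATERMARKS | TOY_CPUSET);

	if (idx < 0)
		idx = toy_get_page(ctx, TOY_NO_WATERMARKS);
	return idx;
}
```

In this model, a caller confined to node 1 keeps draining node 1's reserve and touches node 0 only once node 1 has no free pages left, which is the behaviour the cpuset-aware variant is meant to give. The single-call variant from the patch corresponds to calling toy_get_page() with TOY_NO_WATERMARKS alone, which always starts from the first zone in the list.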