Andrew Morton wrote:
> On Fri, 20 Feb 2015 22:20:00 -0500 "Theodore Ts'o" <tytso@xxxxxxx> wrote:
>
> > +akpm
>
> I was hoping not to have to read this thread ;)

Sorry for getting so complicated.

> What I'm not really understanding is why the pre-3.19 implementation
> actually worked. We've exhausted the free pages, we're not succeeding
> at reclaiming anything, we aren't able to oom-kill anyone. Yet it
> *does* work - we eventually find that memory and everything proceeds.
>
> How come? Where did that memory come from?
>

Even without __GFP_NOFAIL, GFP_NOFS / GFP_NOIO allocations retried forever
(without invoking the OOM killer) if order <= PAGE_ALLOC_COSTLY_ORDER and
TIF_MEMDIE is not set. Somebody else volunteered that memory while retrying.
This implies a silent hang-up forever if nobody volunteers memory.

> And yes, I agree that sites such as xfs's kmem_alloc() should be
> passing __GFP_NOFAIL to tell the page allocator what's going on. I
> don't think it matters a lot whether kmem_alloc() retains its retry
> loop. If __GFP_NOFAIL is working correctly then it will never loop
> anyway...

Commit 9879de7373fc ("mm: page_alloc: embed OOM killing naturally into
allocation slowpath") inadvertently changed GFP_NOFS / GFP_NOIO allocations
not to retry unless __GFP_NOFAIL is specified. Therefore, either applying
Johannes's akpm-doesnt-know-why-it-works patch or passing __GFP_NOFAIL
will restore the pre-3.19 behavior (with the possibility of a silent hang-up).
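
To make the two behaviors easier to compare, here is a rough user-space
model of the retry decision as I described it above. This is not the
kernel code: the MODEL_* flags, struct alloc_request and both function
names are made up for illustration, the flag values are arbitrary, and
the order of the checks is my assumption. It only encodes the rules
stated in this mail.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp bits; values are illustrative. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_NOFAIL  0x4u

/* Simplified masks: GFP_NOIO lacks both IO and FS, GFP_NOFS lacks only FS. */
#define MODEL_GFP_NOIO    0x0u
#define MODEL_GFP_NOFS    MODEL_GFP_IO
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)

/* PAGE_ALLOC_COSTLY_ORDER is 3 in the kernel. */
#define MODEL_COSTLY_ORDER 3u

/* Hypothetical description of one allocation attempt that found no memory. */
struct alloc_request {
	unsigned int gfp;    /* MODEL_GFP_* bits */
	unsigned int order;  /* allocation order */
	bool memdie;         /* models TIF_MEMDIE on the calling task */
};

/*
 * Pre-3.19 rule as stated in this mail: even without __GFP_NOFAIL, a
 * low-order allocation keeps retrying as long as TIF_MEMDIE is not set,
 * whether or not the OOM killer may be invoked for it.
 */
static bool retries_pre_3_19(const struct alloc_request *r)
{
	if (r->gfp & MODEL_GFP_NOFAIL)
		return true;
	if (r->memdie)
		return false;
	return r->order <= MODEL_COSTLY_ORDER;
}

/*
 * 3.19 rule as stated in this mail: GFP_NOFS / GFP_NOIO allocations
 * (modeled here as "the FS bit is missing") no longer retry unless
 * __GFP_NOFAIL is specified.
 */
static bool retries_3_19(const struct alloc_request *r)
{
	if (r->gfp & MODEL_GFP_NOFAIL)
		return true;
	if (r->memdie)
		return false;
	if (!(r->gfp & MODEL_GFP_FS))
		return false;
	return r->order <= MODEL_COSTLY_ORDER;
}
```

In this model an order-0 MODEL_GFP_NOFS request retries under the
pre-3.19 rule but gives up under the 3.19 rule, and adding
MODEL_GFP_NOFAIL makes it retry again. Neither function says where the
memory eventually comes from, which is exactly the silent hang-up risk.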