On Wed 25-11-15 02:51:38, David Rientjes wrote:
> On Wed, 25 Nov 2015, Michal Hocko wrote:
>
> > From: Michal Hocko <mhocko@xxxxxxxx>
> >
> > __GFP_NOFAIL is a big hammer used to ensure that the allocation
> > request can never fail. This is a strong requirement and as such
> > it also deserves a special treatment when the system is OOM. The
> > primary problem here is that the allocation request might have
> > come with some locks held and the oom victim might be blocked
> > on the same locks. This is basically an OOM deadlock situation.
> >
> > This patch tries to reduce the risk of such a deadlocks by giving
> > __GFP_NOFAIL allocations a special treatment and let them dive into
> > memory reserves after oom killer invocation. This should help them
> > to make a progress and release resources they are holding. The OOM
> > victim should compensate for the reserves consumption.
> >
> > Suggested-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > ---
> >  mm/page_alloc.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 8034909faad2..70db11c27046 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2766,8 +2766,13 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> >  			goto out;
> >  	}
> >  	/* Exhausted what can be done so it's blamo time */
> > -	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
> > +	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
> >  		*did_some_progress = 1;
> > +
> > +		if (gfp_mask & __GFP_NOFAIL)
> > +			page = get_page_from_freelist(gfp_mask, order,
> > +					ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);
> > +	}
> >  out:
> >  	mutex_unlock(&oom_lock);
> >  	return page;
>
> I don't understand why you're setting ALLOC_CPUSET if you're giving them
> "special treatment".
> If you want to allow access to memory reserves to
> prevent an oom livelock, then why not also allow it access to allocate
> outside its cpuset?

Good question. My thinking was that __GFP_NOFAIL allocations might be
done on behalf of a process, so they are not necessarily system wide. We
do the same before we actually go to out_of_memory. On the other hand,
__GFP_NOFAIL should be used really rarely, so breaking the cpuset
restriction shouldn't be a big deal if that helps to break out of the
potential OOM deadlock.

I will drop it. Thanks!
---
From d89d17e72e5e1c03539f8c81fc6e120bccd2b460 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@xxxxxxxx>
Date: Tue, 23 Jun 2015 09:15:00 +0200
Subject: [PATCH] mm, oom: Give __GFP_NOFAIL allocations access to memory reserves

__GFP_NOFAIL is a big hammer used to ensure that the allocation
request can never fail. This is a strong requirement and as such
it also deserves special treatment when the system is OOM. The
primary problem here is that the allocation request might have
come with some locks held and the oom victim might be blocked
on the same locks. This is basically an OOM deadlock situation.

This patch tries to reduce the risk of such deadlocks by giving
__GFP_NOFAIL allocations special treatment and letting them dive into
memory reserves after oom killer invocation. This should help them
to make progress and release the resources they are holding. The OOM
victim should compensate for the reserves consumption.

[rientjes@xxxxxxxxxx: do not use ALLOC_CPUSET]
Suggested-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
---
 mm/page_alloc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8034909faad2..94b04c1e894a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2766,8 +2766,13 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 			goto out;
 	}
 	/* Exhausted what can be done so it's blamo time */
-	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
+	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
 		*did_some_progress = 1;
+
+		if (gfp_mask & __GFP_NOFAIL)
+			page = get_page_from_freelist(gfp_mask, order,
+					ALLOC_NO_WATERMARKS, ac);
+	}
 out:
 	mutex_unlock(&oom_lock);
 	return page;
-- 
2.6.2

-- 
Michal Hocko
SUSE Labs
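
For readers following the ALLOC_NO_WATERMARKS vs. ALLOC_CPUSET discussion above, here is a small userspace toy model of the two flags. It is not kernel code and does not reproduce get_page_from_freelist(); every name in it (toy_zone, toy_get_pages, the TOY_ALLOC_* flags, the node bitmask) is hypothetical and exists only for illustration. The sketch assumes a "reserve" is simply a per-zone minimum of free pages, and a "cpuset" is a bitmask of allowed node ids below 32.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags, loosely modelled on the kernel's ALLOC_* bits. */
#define TOY_ALLOC_NO_WATERMARKS 0x1u /* may dip into the per-zone reserve */
#define TOY_ALLOC_CPUSET        0x2u /* restricted to the allowed nodes */

struct toy_zone {
	int node;           /* node id this zone belongs to (assumed < 32) */
	unsigned long free; /* free pages currently in the zone */
	unsigned long min;  /* pages normally held back as a reserve */
};

/*
 * Take `pages` pages from the first zone that can satisfy the request.
 * Returns the index of the zone used, or -1 if no zone qualifies.
 */
static int toy_get_pages(struct toy_zone *zones, size_t nr_zones,
			 unsigned long pages, unsigned int alloc_flags,
			 unsigned int allowed_nodes)
{
	assert(zones != NULL);

	for (size_t i = 0; i < nr_zones; i++) {
		struct toy_zone *z = &zones[i];
		unsigned long reserve;

		/* With the cpuset flag set, skip zones on disallowed nodes. */
		if ((alloc_flags & TOY_ALLOC_CPUSET) &&
		    !(allowed_nodes & (1u << z->node)))
			continue;

		/* Without watermarks, the reserve is fair game. */
		reserve = (alloc_flags & TOY_ALLOC_NO_WATERMARKS) ? 0 : z->min;
		if (z->free < pages || z->free - pages < reserve)
			continue;

		z->free -= pages;
		return (int)i;
	}
	return -1;
}
```

In this model, a request that passes both flags can use the reserve but only on allowed nodes, which corresponds to the first version of the patch. Passing TOY_ALLOC_NO_WATERMARKS alone lets the request fall through to any node, which corresponds to the revised version.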