Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free

On Tue 22-11-16 17:38:01, Greg KH wrote:
> On Tue, Nov 22, 2016 at 05:14:02PM +0100, Vlastimil Babka wrote:
> > On 11/22/2016 05:06 PM, Marc MERLIN wrote:
> > > On Mon, Nov 21, 2016 at 01:56:39PM -0800, Marc MERLIN wrote:
> > >> On Mon, Nov 21, 2016 at 10:50:20PM +0100, Vlastimil Babka wrote:
> > >>>> 4.9rc5 however seems to be doing better, and is still running after 18
> > >>>> hours. However, I got a few page allocation failures as per below, but the
> > >>>> system seems to recover.
> > >>>> Vlastimil, do you want me to continue the copy on 4.9 (may take 3-5 days) 
> > >>>> or is that good enough, and i should go back to 4.8.8 with that patch applied?
> > >>>> https://marc.info/?l=linux-mm&m=147423605024993
> > >>>
> > >>> Hi, I think it's enough for 4.9 for now and I would appreciate trying
> > >>> 4.8 with that patch, yeah.
> > >>
> > >> So the good news is that it's been running for almost 5H and so far so good.
> > > 
> > > And the better news is that the copy is still going strong, 4.4TB and
> > > going. So 4.8.8 is fixed with that one single patch as far as I'm
> > > concerned.
> > > 
> > > So thanks for that, looks good to me to merge.
> > 
> > Thanks a lot for the testing. So what do we do now about 4.8? (4.7 is
> > already EOL AFAICS).
> > 
> > - send the patch [1] as 4.8-only stable. Greg won't like that, I expect.
> >   - alternatively a simpler (again, 4.8-only) patch that just outright
> > prevents OOM for 0 < order < costly, as Michal already suggested.
> > - backport 10+ compaction patches to 4.8 stable
> > - something else?
> 
> Just wait for 4.8-stable to go end-of-life in a few weeks after 4.9 is
> released?  :)

OK, so can we push this through to 4.8 before EOL and make sure there
won't be any additional pre-mature high order OOM reports? The patch
should be simple enough and safe for the stable tree. There is no
upstream commit because 4.9 is fixed in a different way which would be
way too intrusive for the stable backport.
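
For readers who want the resulting decision spelled out, here is a minimal
userspace sketch of it. It is not kernel code: model_should_compact_retry(),
its compaction_enabled parameter and model_self_check() are hypothetical names
made up for illustration, and the real function's zone-based heuristics for
the !CONFIG_COMPACTION case are not modelled at all (the sketch simply returns
false there). Only the constant PAGE_ALLOC_COSTLY_ORDER (3) is taken from the
kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as in the kernel: orders above this count as "costly". */
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Hypothetical model of the retry decision once the patch is applied.
 * compaction_enabled stands in for CONFIG_COMPACTION being set.
 */
bool model_should_compact_retry(unsigned int order, bool compaction_enabled)
{
	/* Order-0 and costly requests are never retried on this path. */
	if (order == 0 || order > PAGE_ALLOC_COSTLY_ORDER)
		return false;

	/*
	 * The workaround: a !costly high-order request always retries,
	 * so it can never be the one that declares OOM.
	 */
	if (compaction_enabled)
		return true;

	/* Assumption: the pre-existing heuristics are not modelled here. */
	return false;
}

/* Small self-check of the cases described in the changelog. */
void model_self_check(void)
{
	assert(!model_should_compact_retry(0, true));
	assert(model_should_compact_retry(1, true));
	assert(model_should_compact_retry(PAGE_ALLOC_COSTLY_ORDER, true));
	assert(!model_should_compact_retry(PAGE_ALLOC_COSTLY_ORDER + 1, true));
}
```

In this model only orders 1 to 3 with compaction enabled return true, which is
the "never declare OOM for !costly high order requests" behaviour; order-0
requests return false and remain the ones that can end up declaring OOM.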
--- 
From 02306e8d593fa8a48d620e0c9d63a934ca8366d8 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@xxxxxxxx>
Date: Wed, 23 Nov 2016 07:26:30 +0100
Subject: [PATCH] mm, oom: stop pre-mature high-order OOM killer invocations

31e49bfda184 ("mm, oom: protect !costly allocations some more for
!CONFIG_COMPACTION") was an attempt to reduce chances of pre-mature OOM
killer invocation for high order requests. It seemed to work for most
users just fine but it is far from bulletproof and obviously not
sufficient for Marc who has reported pre-mature OOM killer invocations
with 4.8 based kernels. 4.9 with all the compaction improvements seems
to be behaving much better but that would be too intrusive to backport
to 4.8 stable kernels. Instead this patch simply never declares OOM for
!costly high order requests. We rely on order-0 requests to do that in
case we are really out of memory. Order-0 requests are much more common
and so a risk of a livelock without any way forward is highly unlikely.

Reported-by: Marc MERLIN <marc@xxxxxxxxxxx>
Tested-by: Marc MERLIN <marc@xxxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
---
 mm/page_alloc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a2214c64ed3c..7401e996009a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3161,6 +3161,16 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
 	if (!order || order > PAGE_ALLOC_COSTLY_ORDER)
 		return false;
 
+#ifdef CONFIG_COMPACTION
+	/*
+	 * This is a gross workaround to compensate a lack of reliable compaction
+	 * operation. We cannot simply go OOM with the current state of the compaction
+	 * code because this can lead to pre mature OOM declaration.
+	 */
+	if (order <= PAGE_ALLOC_COSTLY_ORDER)
+		return true;
+#endif
+
 	/*
 	 * There are setups with compaction disabled which would prefer to loop
 	 * inside the allocator rather than hit the oom killer prematurely.
-- 
2.10.2

-- 
Michal Hocko
SUSE Labs