Re: isolate_freepages_block and excessive CPU usage by OSD process

On Thu, Dec 04, 2014 at 06:30:45PM +1100, Christian Marie wrote:
> On Wed, Dec 03, 2014 at 04:57:47PM +0900, Joonsoo Kim wrote:
> > It'd be very helpful to get output of
> > "trace_event=compaction:*,kmem:mm_page_alloc_extfrag" on the kernel
> > with my tracepoint patches below.
> > 
> > See following link. There is 3 patches.
> > 
> > https://lkml.org/lkml/2014/12/3/71
> 
> I have just finished testing 3.18rc5 with both of the small patches mentioned
> earlier in this thread and 2/3 of your event patches. The second patch
> (https://lkml.org/lkml/2014/12/3/72) did not apply due to compaction_suitable
> being different (am I missing another patch you are basing this off?).

In fact, I'm using the next-20141124 kernel, not mainline. It has a
lot of fixes from Vlastimil, which is probably why the patch failed to
apply. But that's not important in this case; the log below gives me
enough information about this problem.

> 
> My compaction_suitable is:
> 
> 	unsigned long compaction_suitable(struct zone *zone, int order)
> 
> Results without that second event patch are as follows:
> 
> Trace under heavy load but before any spiking system usage or significant
> compaction spinning:
> 
> http://ponies.io/raw/compaction_events/before.gz
> 
> Trace during 100% cpu utilization, much of which was in system:
> 
> http://ponies.io/raw/compaction_events/during.gz

It looks like there is no stop condition in isolate_freepages(). During
this period your system does not have enough free pages, and many
processes are trying to find free pages for compaction. Because there is
no stop condition, they iterate over almost the whole memory range every
time. At the bottom of this mail I attach one more fix, although I
haven't tested it yet. It will cause many of the allocations that your
network layer needs to fail. They are order-5 allocation requests with
the __GFP_NOWARN gfp flag, so I assume it is not a problem if they fail,
but I'm not sure.
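
As a rough illustration of the behaviour described above, here is a toy
userspace model of a free scanner. It is not kernel code: the function
name toy_free_scan, the per-block free counts and the single threshold
parameter are all assumptions made for this sketch. It only shows why a
scan with no stop condition visits every block when free memory is
scarce, and how a free-page threshold check cuts the scan short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not the kernel implementation).
 * Walks blocks from the end of the zone toward the start, collecting
 * free pages until `wanted` pages are isolated. Returns the number of
 * blocks visited. When `use_stop` is true, the scan also stops as soon
 * as the zone's remaining free pages are at or below `threshold`.
 */
size_t toy_free_scan(const unsigned long *free_in_block, size_t nr_blocks,
                     unsigned long zone_free, unsigned long threshold,
                     unsigned long wanted, bool use_stop)
{
    unsigned long isolated = 0;
    size_t visited = 0;

    for (size_t i = nr_blocks; i > 0 && isolated < wanted; i--) {
        unsigned long got = free_in_block[i - 1];

        /* The model assumes zone_free covers every free page in the blocks. */
        assert(got <= zone_free);

        visited++;
        isolated += got;
        zone_free -= got;   /* isolated pages leave the free lists */

        /* Simplified stand-in for an order-0 watermark check. */
        if (use_stop && zone_free <= threshold)
            break;
    }
    return visited;
}
```

With eight blocks of which only the last holds 4 free pages, and a
request for 32 pages, the model visits all eight blocks without the
stop condition and only one block with it. When free pages are
plentiful, both variants stop after the same number of blocks.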

The watermark check in this patch needs cc->classzone_idx and
cc->alloc_flags, which come from Vlastimil's recent change. If you want
to test it with 3.18rc5, please remove it. It doesn't matter much.
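
For reference, the patch at the bottom of this mail computes its
threshold as low_wmark_pages(zone) + (2UL << cc->order). The sketch
below reproduces only that arithmetic in userspace C. The function
names and the plain "free pages must exceed the mark" comparison are
assumptions for illustration; the real zone_watermark_ok() also
accounts for things this sketch ignores.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: the threshold used by the patch, given the
 * zone's low watermark (in pages) and the allocation order.
 */
unsigned long toy_compaction_watermark(unsigned long low_wmark,
                                       unsigned int order)
{
    return low_wmark + (2UL << order);
}

/*
 * Simplified order-0 check: "ok" only if free pages exceed the mark.
 * This is a stand-in, not the kernel's zone_watermark_ok().
 */
bool toy_order0_watermark_ok(unsigned long free_pages, unsigned long mark)
{
    return free_pages > mark;
}
```

For an order-5 request the extra term is 2UL << 5 = 64 pages, so with a
low watermark of 1000 pages the scan in this model would stop once free
pages fall to 1064 or fewer.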

Anyway, I hope it also helps you.

> perf report at the time of during.gz:
> 
> http://ponies.io/raw/compaction_events/perf.png

Judging from this perf report, my second patch would have no impact on
your system. I thought that this excessive cpu usage started in SLUB,
but an order-5 kmalloc request is simply forwarded to the page allocator
in the current SLUB implementation, so my patch 2 would not help with
this problem.

By the way, is it common for the network layer to need order-5
allocations? IMHO, it'd be better to avoid such high-order requests,
because the kernel can easily fail to satisfy them.
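
To put the size of such a request in numbers: an order-n allocation
asks for 2^n physically contiguous pages. The helpers below are a
small sketch with hypothetical names, and the 4 KiB page size used in
the example is an assumption, since the thread does not state the page
size of the affected machine.

```c
#include <stddef.h>

/* Number of contiguous pages in an order-`order` allocation. */
size_t toy_order_to_pages(unsigned int order)
{
    return (size_t)1 << order;
}

/* Size in bytes of an order-`order` allocation for a given page size. */
size_t toy_order_to_bytes(unsigned int order, size_t page_size)
{
    return page_size << order;
}
```

Under that assumption, order 5 means 32 contiguous pages, or 128 KiB,
which is much harder to find on a fragmented system than single pages.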

Thanks.

> 
> Interested to see what you make of the limited information. I may be able to
> try all of your patches some time next week against whatever they apply cleanly
> to. If that is needed.

------------>8-----------------
From b7daa232c327a4ebbb48ca0538a2dbf9ca83ca1f Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Date: Fri, 5 Dec 2014 09:38:30 +0900
Subject: [PATCH] mm/compaction: stop the compaction if there isn't enough
 freepage

After compaction_suitable() passes, there is no further check of whether
the system has enough free memory to compact, so compaction blindly
tries to find free pages by iterating over the whole memory range. This
causes excessive cpu usage in low free memory conditions, and compaction
fails in the end anyway. It makes sense to stop compaction if there
aren't enough free pages, so this patch adds a watermark check to
isolate_freepages() in order to stop compaction in this case.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
---
 mm/compaction.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/compaction.c b/mm/compaction.c
index e005620..31c4009 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -828,6 +828,7 @@ static void isolate_freepages(struct compact_control *cc)
 	unsigned long low_pfn;	     /* lowest pfn scanner is able to scan */
 	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
+	unsigned long watermark = low_wmark_pages(zone) + (2UL << cc->order);
 
 	/*
 	 * Initialise the free scanner. The starting point is where we last
@@ -903,6 +904,14 @@ static void isolate_freepages(struct compact_control *cc)
 		 */
 		if (cc->contended)
 			break;
+
+		/*
+		 * Watermarks for order-0 must be met for compaction.
+		 * See compaction_suitable for more detailed explanation.
+		 */
+		if (!zone_watermark_ok(zone, 0, watermark,
+			cc->classzone_idx, cc->alloc_flags))
+			break;
 	}
 
 	/* split_free_page does not map the pages */
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .