On Thu, Dec 04, 2014 at 06:30:45PM +1100, Christian Marie wrote:
> On Wed, Dec 03, 2014 at 04:57:47PM +0900, Joonsoo Kim wrote:
> > It'd be very helpful to get output of
> > "trace_event=compaction:*,kmem:mm_page_alloc_extfrag" on the kernel
> > with my tracepoint patches below.
> >
> > See following link. There is 3 patches.
> >
> > https://lkml.org/lkml/2014/12/3/71
>
> I have just finished testing 3.18rc5 with both of the small patches mentioned
> earlier in this thread and 2/3 of your event patches. The second patch
> (https://lkml.org/lkml/2014/12/3/72) did not apply due to compaction_suitable
> being different (am I missing another patch you are basing this off?).

In fact, I'm using the next-20141124 kernel, not just mainline. It has a lot
of fixes from Vlastimil, which may be why the patch failed to apply. But
that's not so important in this case. I have gotten enough information about
this problem from your log below.

>
> My compaction_suitable is:
>
> 	unsigned long compaction_suitable(struct zone *zone, int order)
>
> Results without that second event patch are as follows:
>
> Trace under heavy load but before any spiking system usage or significant
> compaction spinning:
>
> http://ponies.io/raw/compaction_events/before.gz
>
> Trace during 100% cpu utilization, much of which was in system:
>
> http://ponies.io/raw/compaction_events/during.gz

It looks like there is no stop condition in isolate_freepages(). During this
period, your system does not have enough free pages, and many processes are
trying to find free pages for compaction. Because there is no stop condition,
they iterate over almost the whole memory range every time. At the bottom of
this mail, I attach one more fix, although I haven't tested it yet. It will
cause a lot of failures of the allocations that your network layer needs.
They are order-5 allocation requests with the __GFP_NOWARN gfp flag, so I
assume there is no problem if the allocation request fails, but I'm not sure.

The watermark check in this patch needs cc->classzone_idx and
cc->alloc_flags, which come from Vlastimil's recent change. If you want to
test it with 3.18rc5, please remove it. It doesn't matter much.

Anyway, I hope it also helps you.

> perf report at the time of during.gz:
>
> http://ponies.io/raw/compaction_events/perf.png

Judging from this perf report, my second patch would have no impact on your
system. I thought that this excessive cpu usage started from SLUB, but an
order-5 kmalloc request is just forwarded to the page allocator in the
current SLUB implementation, so my patch 2 would not help with this problem.

By the way, is it common for the network layer to need order-5 allocations?
IMHO, it'd be better to avoid this high-order request, because the kernel
easily fails to handle this kind of request.

Thanks.

>
> Interested to see what you make of the limited information. I may be able to
> try all of your patches some time next week against whatever they apply cleanly
> to. If that is needed.

------------>8-----------------
From b7daa232c327a4ebbb48ca0538a2dbf9ca83ca1f Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Date: Fri, 5 Dec 2014 09:38:30 +0900
Subject: [PATCH] mm/compaction: stop the compaction if there isn't enough
 freepage

After compaction_suitable() passes, there is no check whether the system
has enough memory to compact, and compaction blindly tries to find free
pages by iterating over the whole memory range. This causes excessive cpu
usage in low free memory conditions, and in the end compaction fails anyway.

It makes sense to stop compaction if there aren't enough free pages. So
this patch adds a watermark check to isolate_freepages() in order to stop
compaction in this case.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
---
 mm/compaction.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/compaction.c b/mm/compaction.c
index e005620..31c4009 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -828,6 +828,7 @@ static void isolate_freepages(struct compact_control *cc)
 	unsigned long low_pfn;	     /* lowest pfn scanner is able to scan */
 	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
+	unsigned long watermark = low_wmark_pages(zone) + (2UL << cc->order);
 
 	/*
 	 * Initialise the free scanner. The starting point is where we last
@@ -903,6 +904,14 @@ static void isolate_freepages(struct compact_control *cc)
 		 */
 		if (cc->contended)
 			break;
+
+		/*
+		 * Watermarks for order-0 must be met for compaction.
+		 * See compaction_suitable for more detailed explanation.
+		 */
+		if (!zone_watermark_ok(zone, 0, watermark,
+			cc->classzone_idx, cc->alloc_flags))
+			break;
 	}
 
 	/* split_free_page does not map the pages */
-- 
1.7.9.5
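
For anyone following the arithmetic of the new stop condition, here is a
minimal userspace sketch of it. This is not kernel code. struct zone_model,
compaction_stop_watermark() and should_stop_isolating() are hypothetical
names made up for illustration, and the comparison is only a simplified
stand-in for zone_watermark_ok(): it ignores lowmem reserves, alloc flags
and CMA pages, all of which the real check takes into account.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the zone fields the check reads. */
struct zone_model {
	unsigned long free_pages;	/* pages currently free in the zone */
	unsigned long low_wmark;	/* models low_wmark_pages(zone) */
};

/*
 * Threshold used by the patch: the low watermark plus twice the size of
 * the request in pages, i.e. low_wmark + (2UL << order).
 */
unsigned long compaction_stop_watermark(const struct zone_model *z,
					unsigned int order)
{
	/* Guard against shifting past the width of unsigned long. */
	assert(order < 8 * sizeof(unsigned long) - 1);
	return z->low_wmark + (2UL << order);
}

/*
 * Simplified order-0 watermark test (an assumption, see above): the free
 * scanner should give up once the free page count is no longer above the
 * threshold.
 */
bool should_stop_isolating(const struct zone_model *z, unsigned int order)
{
	return z->free_pages <= compaction_stop_watermark(z, order);
}
```

For an order-5 request, 2UL << 5 is 64 pages, so a zone modelled with a low
watermark of 900 pages stops isolating once it is down to 964 free pages or
fewer.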