On 05/20/2014 01:37 AM, Andrew Morton wrote: > On Fri, 16 May 2014 11:47:53 +0200 Vlastimil Babka <vbabka@xxxxxxx> wrote: > >> Compaction uses compact_checklock_irqsave() function to periodically check for >> lock contention and need_resched() to either abort async compaction, or to >> free the lock, schedule and retake the lock. When aborting, cc->contended is >> set to signal the contended state to the caller. Two problems have been >> identified in this mechanism. >> >> First, compaction also calls directly cond_resched() in both scanners when no >> lock is yet taken. This call either does not abort async compaction, or set >> cc->contended appropriately. This patch introduces a new compact_should_abort() >> function to achieve both. In isolate_freepages(), the check frequency is >> reduced to once by SWAP_CLUSTER_MAX pageblocks to match what the migration >> scanner does in the preliminary page checks. In case a pageblock is found >> suitable for calling isolate_freepages_block(), the checks within there are >> done on higher frequency. >> >> Second, isolate_freepages() does not check if isolate_freepages_block() >> aborted due to contention, and advances to the next pageblock. This violates >> the principle of aborting on contention, and might result in pageblocks not >> being scanned completely, since the scanning cursor is advanced. This patch >> makes isolate_freepages_block() check the cc->contended flag and abort. >> >> In case isolate_freepages() has already isolated some pages before aborting >> due to contention, page migration will proceed, which is OK since we do not >> want to waste the work that has been done, and page migration has own checks >> for contention. However, we do not want another isolation attempt by either >> of the scanners, so cc->contended flag check is added also to >> compaction_alloc() and compact_finished() to make sure compaction is aborted >> right after the migration. > > What are the runtime effect of this change? > >> Reported-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> > > What did Joonsoo report? Perhaps this is the same thing.. Updated log message: Compaction uses compact_checklock_irqsave() function to periodically check for lock contention and need_resched() to either abort async compaction, or to free the lock, schedule and retake the lock. When aborting, cc->contended is set to signal the contended state to the caller. Two problems have been identified in this mechanism. First, compaction also calls directly cond_resched() in both scanners when no lock is yet taken. This call either does not abort async compaction, or set cc->contended appropriately. This patch introduces a new compact_should_abort() function to achieve both. In isolate_freepages(), the check frequency is reduced to once by SWAP_CLUSTER_MAX pageblocks to match what the migration scanner does in the preliminary page checks. In case a pageblock is found suitable for calling isolate_freepages_block(), the checks within there are done on higher frequency. Second, isolate_freepages() does not check if isolate_freepages_block() aborted due to contention, and advances to the next pageblock. This violates the principle of aborting on contention, and might result in pageblocks not being scanned completely, since the scanning cursor is advanced. This problem has been noticed in the code by Joonsoo Kim when reviewing related patches. This patch makes isolate_freepages_block() check the cc->contended flag and abort. In case isolate_freepages() has already isolated some pages before aborting due to contention, page migration will proceed, which is OK since we do not want to waste the work that has been done, and page migration has own checks for contention. However, we do not want another isolation attempt by either of the scanners, so cc->contended flag check is added also to compaction_alloc() and compact_finished() to make sure compaction is aborted right after the migration. The outcome of the patch should be reduced lock contention by async compaction and lower latencies for higher-order allocations where direct compaction is involved. >> >> ... >> >> @@ -718,9 +739,11 @@ static void isolate_freepages(struct zone *zone, >> /* >> * This can iterate a massively long zone without finding any >> * suitable migration targets, so periodically check if we need >> - * to schedule. >> + * to schedule, or even abort async compaction. >> */ >> - cond_resched(); >> + if (!(block_start_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages)) >> + && compact_should_abort(cc)) > > This seems rather gratuitously inefficient and isn't terribly clear. > What's wrong with > > if ((++foo % SWAP_CLUSTER_MAX) == 0 && compact_should_abort(cc)) It's a new variable and it differs from how isolate_migratepages_range() does this. But yeah, I might change it later there as well. There it makes even more sense. E.g. when skipping whole pageblock there, pfn % SWAP_CLUSTER_MAX will be always zero so the periodicity varies. > ? > > (Assumes that SWAP_CLUSTER_MAX is power-of-2 and that the compiler will > use &) I hoped that compiler would be smart enough about SWAP_CLUSTER_MAX * pageblock_nr_pages as well, as those are constants and also power-of-2. But I didn't check the assembly. -----8<----- From: Vlastimil Babka <vbabka@xxxxxxx> Date: Wed, 21 May 2014 16:07:21 +0200 Subject: [PATCH] mm-compaction-properly-signal-and-act-upon-lock-and-need_sched-contention-fix Use a separate counter variable (not pfn) to trigger abort checks. Suggested-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx> --- mm/compaction.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/compaction.c b/mm/compaction.c index bbe6a26..23c7439 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -779,6 +779,7 @@ static void isolate_freepages(struct zone *zone, unsigned long block_start_pfn; /* start of current pageblock */ unsigned long block_end_pfn; /* end of current pageblock */ unsigned long low_pfn; /* lowest pfn scanner is able to scan */ + unsigned long nr_blocks_scanned = 0; /* for periodical abort checks */ int nr_freepages = cc->nr_freepages; struct list_head *freelist = &cc->freepages; @@ -813,7 +814,7 @@ static void isolate_freepages(struct zone *zone, * suitable migration targets, so periodically check if we need * to schedule, or even abort async compaction. */ - if (!(block_start_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages)) + if ((++nr_blocks_scanned % SWAP_CLUSTER_MAX) == 0 && compact_should_abort(cc)) break; -- 1.8.4.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>