+ mm-compaction-direct-freepage-allocation-for-async-direct-compaction.patch added to -mm tree

The patch titled
     Subject: mm, compaction: direct freepage allocation for async direct compaction
has been added to the -mm tree.  Its filename is
     mm-compaction-direct-freepage-allocation-for-async-direct-compaction.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-compaction-direct-freepage-allocation-for-async-direct-compaction.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-compaction-direct-freepage-allocation-for-async-direct-compaction.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Vlastimil Babka <vbabka@xxxxxxx>
Subject: mm, compaction: direct freepage allocation for async direct compaction

The goal of direct compaction is to quickly make a high-order page
available for the pending allocation.  The free page scanner can add
significant latency when searching for migration targets, although for
the compaction to succeed, the only important constraint on the target
free pages is that they must not come from the same order-aligned block
as the migrated pages.

This patch therefore makes direct async compaction allocate freepages
directly from freelists.  Pages that do come from the same block (which we
cannot simply exclude from the freelist allocation) are put on a separate
list and released only after migration to allow them to merge.

In addition to reducing stalls, another advantage is that we split larger
free pages for migration targets only when smaller pages are depleted,
while the free scanner can split pages up to (order - 1) as it encounters
them.  However, this approach likely sacrifices some of the long-term
anti-fragmentation features of a thorough compaction, so the direct
allocation approach is limited to async direct compaction.

For observational purposes, the patch introduces two new counters to
/proc/vmstat.  compact_free_direct_alloc counts how many pages were
allocated directly without scanning, and compact_free_direct_miss counts
the subset of these allocations that were from the wrong range and had to
be held on the separate list.
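
The counters can be read like any other /proc/vmstat field; a minimal
userspace sketch (not part of the patch, and assuming a kernel with this
patch applied):

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f) {
		perror("/proc/vmstat");
		return 1;
	}
	/* Print compact_free_direct_alloc and compact_free_direct_miss. */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "compact_free_direct", 19))
			fputs(line, stdout);
	fclose(f);
	return 0;
}

A high miss/alloc ratio means that most directly allocated pages came
from the block being migrated and had to be held on the separate list.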

Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/vm_event_item.h |    1 
 mm/compaction.c               |   52 +++++++++++++++++++++++++++++++-
 mm/internal.h                 |    5 +++
 mm/page_alloc.c               |   27 ++++++++++++++++
 mm/vmstat.c                   |    2 +
 5 files changed, 86 insertions(+), 1 deletion(-)

diff -puN include/linux/vm_event_item.h~mm-compaction-direct-freepage-allocation-for-async-direct-compaction include/linux/vm_event_item.h
--- a/include/linux/vm_event_item.h~mm-compaction-direct-freepage-allocation-for-async-direct-compaction
+++ a/include/linux/vm_event_item.h
@@ -51,6 +51,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 #endif
 #ifdef CONFIG_COMPACTION
 		COMPACTMIGRATE_SCANNED, COMPACTFREE_SCANNED,
+		COMPACTFREE_DIRECT_ALLOC, COMPACTFREE_DIRECT_MISS,
 		COMPACTISOLATED,
 		COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS,
 		KCOMPACTD_WAKE,
diff -puN mm/compaction.c~mm-compaction-direct-freepage-allocation-for-async-direct-compaction mm/compaction.c
--- a/mm/compaction.c~mm-compaction-direct-freepage-allocation-for-async-direct-compaction
+++ a/mm/compaction.c
@@ -1088,6 +1088,41 @@ static void isolate_freepages(struct com
 	cc->free_pfn = isolate_start_pfn;
 }
 
+static void isolate_freepages_direct(struct compact_control *cc)
+{
+	unsigned long nr_pages;
+	unsigned long flags;
+
+	nr_pages = cc->nr_migratepages - cc->nr_freepages;
+
+	if (!compact_trylock_irqsave(&cc->zone->lock, &flags, cc))
+		return;
+
+	while (nr_pages) {
+		struct page *page;
+		unsigned long pfn;
+
+		page = alloc_pages_zone(cc->zone, 0, MIGRATE_MOVABLE);
+		if (!page)
+			break;
+		pfn = page_to_pfn(page);
+
+		count_compact_event(COMPACTFREE_DIRECT_ALLOC);
+
+		/* Is the free page in the block we are migrating from? */
+		if (pfn >> cc->order ==	(cc->migrate_pfn - 1) >> cc->order) {
+			list_add(&page->lru, &cc->freepages_held);
+			count_compact_event(COMPACTFREE_DIRECT_MISS);
+		} else {
+			list_add(&page->lru, &cc->freepages);
+			cc->nr_freepages++;
+			nr_pages--;
+		}
+	}
+
+	spin_unlock_irqrestore(&cc->zone->lock, flags);
+}
+
 /*
  * This is a migrate-callback that "allocates" freepages by taking pages
  * from the isolated freelists in the block we are migrating to.
@@ -1104,7 +1139,12 @@ static struct page *compaction_alloc(str
 	 * contention.
 	 */
 	if (list_empty(&cc->freepages)) {
-		if (!cc->contended)
+		if (cc->contended)
+			return NULL;
+
+		if (cc->direct_compaction && (cc->mode == MIGRATE_ASYNC))
+			isolate_freepages_direct(cc);
+		else
 			isolate_freepages(cc);
 
 		if (list_empty(&cc->freepages))
@@ -1480,6 +1520,10 @@ static int compact_zone(struct zone *zon
 						(cc->mode == MIGRATE_ASYNC)) {
 				cc->migrate_pfn = block_end_pfn(
 						cc->migrate_pfn - 1, cc->order);
+
+				if (!list_empty(&cc->freepages_held))
+					release_freepages(&cc->freepages_held);
+
 				/* Draining pcplists is useless in this case */
 				cc->last_migrated_pfn = 0;
 
@@ -1500,6 +1544,8 @@ check_drain:
 				block_start_pfn(cc->migrate_pfn, cc->order);
 
 			if (cc->last_migrated_pfn < current_block_start) {
+				if (!list_empty(&cc->freepages_held))
+					release_freepages(&cc->freepages_held);
 				cpu = get_cpu();
 				lru_add_drain_cpu(cpu);
 				drain_local_pages(zone);
@@ -1530,6 +1576,8 @@ out:
 		if (free_pfn > zone->compact_cached_free_pfn)
 			zone->compact_cached_free_pfn = free_pfn;
 	}
+	if (!list_empty(&cc->freepages_held))
+		release_freepages(&cc->freepages_held);
 
 	trace_mm_compaction_end(start_pfn, cc->migrate_pfn,
 				cc->free_pfn, end_pfn, sync, ret);
@@ -1558,6 +1606,7 @@ static unsigned long compact_zone_order(
 	};
 	INIT_LIST_HEAD(&cc.freepages);
 	INIT_LIST_HEAD(&cc.migratepages);
+	INIT_LIST_HEAD(&cc.freepages_held);
 
 	ret = compact_zone(zone, &cc);
 
@@ -1703,6 +1752,7 @@ static void __compact_pgdat(pg_data_t *p
 		cc->zone = zone;
 		INIT_LIST_HEAD(&cc->freepages);
 		INIT_LIST_HEAD(&cc->migratepages);
+		INIT_LIST_HEAD(&cc->freepages_held);
 
 		/*
 		 * When called via /proc/sys/vm/compact_memory
diff -puN mm/internal.h~mm-compaction-direct-freepage-allocation-for-async-direct-compaction mm/internal.h
--- a/mm/internal.h~mm-compaction-direct-freepage-allocation-for-async-direct-compaction
+++ a/mm/internal.h
@@ -145,6 +145,8 @@ static inline struct page *pageblock_pfn
 }
 
 extern int __isolate_free_page(struct page *page, unsigned int order);
+extern struct page * alloc_pages_zone(struct zone *zone, unsigned int order,
+							int migratetype);
 extern void __free_pages_bootmem(struct page *page, unsigned long pfn,
 					unsigned int order);
 extern void prep_compound_page(struct page *page, unsigned int order);
@@ -165,6 +167,9 @@ extern int user_min_free_kbytes;
 struct compact_control {
 	struct list_head freepages;	/* List of free pages to migrate to */
 	struct list_head migratepages;	/* List of pages being migrated */
+	struct list_head freepages_held;/* List of free pages from the block
+					 * that's being migrated
+					 */
 	unsigned long nr_freepages;	/* Number of isolated free pages */
 	unsigned long nr_migratepages;	/* Number of pages to migrate */
 	unsigned long free_pfn;		/* isolate_freepages search base */
diff -puN mm/page_alloc.c~mm-compaction-direct-freepage-allocation-for-async-direct-compaction mm/page_alloc.c
--- a/mm/page_alloc.c~mm-compaction-direct-freepage-allocation-for-async-direct-compaction
+++ a/mm/page_alloc.c
@@ -2342,6 +2342,33 @@ int split_free_page(struct page *page)
 }
 
 /*
+ * Like split_free_page, but given the zone, it will grab a free page from
+ * the freelists.
+ */
+struct page *
+alloc_pages_zone(struct zone *zone, unsigned int order, int migratetype)
+{
+	struct page *page;
+	unsigned long watermark;
+
+	watermark = low_wmark_pages(zone) + (1 << order);
+	if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
+		return NULL;
+
+	page = __rmqueue(zone, order, migratetype);
+	if (!page)
+		return NULL;
+
+	__mod_zone_freepage_state(zone, -(1 << order),
+					  get_pcppage_migratetype(page));
+
+	set_page_owner(page, order, __GFP_MOVABLE);
+	set_page_refcounted(page);
+
+	return page;
+}
+
+/*
  * Allocate a page from the given zone. Use pcplists for order-0 allocations.
  */
 static inline
diff -puN mm/vmstat.c~mm-compaction-direct-freepage-allocation-for-async-direct-compaction mm/vmstat.c
--- a/mm/vmstat.c~mm-compaction-direct-freepage-allocation-for-async-direct-compaction
+++ a/mm/vmstat.c
@@ -822,6 +822,8 @@ const char * const vmstat_text[] = {
 #ifdef CONFIG_COMPACTION
 	"compact_migrate_scanned",
 	"compact_free_scanned",
+	"compact_free_direct_alloc",
+	"compact_free_direct_miss",
 	"compact_isolated",
 	"compact_stall",
 	"compact_fail",
_

Patches currently in -mm which might be from vbabka@xxxxxxx are

mm-compaction-wrap-calculating-first-and-last-pfn-of-pageblock.patch
mm-compaction-reduce-spurious-pcplist-drains.patch
mm-compaction-skip-blocks-where-isolation-fails-in-async-direct-compaction.patch
mm-compaction-direct-freepage-allocation-for-async-direct-compaction.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
