Re: [patch 2/2] mm, compaction: persistently skip hugetlbfs pageblocks

On 08/23/2017 10:41 AM, Vlastimil Babka wrote:
> On 08/16/2017 01:39 AM, David Rientjes wrote:
>> It is pointless to migrate hugetlb memory as part of memory compaction if
>> the hugetlb size is equal to the pageblock order.  No defragmentation is
>> occurring in this condition.
>>
>> It is also pointless for the freeing scanner to scan a pageblock where
>> a hugetlb page is pinned.  Unconditionally skip these pageblocks, and do
>> so persistently so that they are not rescanned until it is observed that
>> these hugepages are no longer pinned.
>>
>> It would also be possible to do this by involving the hugetlb subsystem
>> in marking pageblocks to no longer be skipped when their hugetlb pages are
>> freed.  This is a simple solution that doesn't involve any additional
>> subsystems in pageblock skip manipulation.
>>
>> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
>> ---
>>  mm/compaction.c | 48 +++++++++++++++++++++++++++++++++++++-----------
>>  1 file changed, 37 insertions(+), 11 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -217,6 +217,20 @@ static void reset_cached_positions(struct zone *zone)
>>  				pageblock_start_pfn(zone_end_pfn(zone) - 1);
>>  }
>>  
>> +/*
>> + * Hugetlbfs pages should consistenly be skipped until updated by the hugetlb
>> + * subsystem.  It is always pointless to compact pages of pageblock_order and
>> + * the free scanner can reconsider when no longer huge.
>> + */
>> +static bool pageblock_skip_persistent(struct page *page, unsigned int order)
>> +{
>> +	if (!PageHuge(page))
>> +		return false;
>> +	if (order != pageblock_order)
>> +		return false;
>> +	return true;
> 
> Why just HugeTLBfs? There's also no point in migrating/finding free
> pages in THPs. Actually, any compound page of pageblock order?
> 
>> +}
>> +
>>  /*
>>   * This function is called to clear all cached information on pageblocks that
>>   * should be skipped for page isolation when the migrate and free page scanner
>> @@ -241,6 +255,8 @@ static void __reset_isolation_suitable(struct zone *zone)
>>  			continue;
>>  		if (zone != page_zone(page))
>>  			continue;
>> +		if (pageblock_skip_persistent(page, compound_order(page)))
>> +			continue;
> 
> I like the idea of how persistency is achieved by rechecking in the reset.
> 
>>  
>>  		clear_pageblock_skip(page);
>>  	}
>> @@ -448,13 +464,15 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>>  		 * and the only danger is skipping too much.
>>  		 */
>>  		if (PageCompound(page)) {
>> -			unsigned int comp_order = compound_order(page);
>> -
>> -			if (likely(comp_order < MAX_ORDER)) {
>> -				blockpfn += (1UL << comp_order) - 1;
>> -				cursor += (1UL << comp_order) - 1;
>> +			const unsigned int order = compound_order(page);
>> +
>> +			if (pageblock_skip_persistent(page, order)) {
>> +				set_pageblock_skip(page);
>> +				blockpfn = end_pfn;
>> +			} else if (likely(order < MAX_ORDER)) {
>> +				blockpfn += (1UL << order) - 1;
>> +				cursor += (1UL << order) - 1;
>>  			}
> 
> Is this new code (and below) really necessary? The existing code should
> already lead to skip bit being set via update_pageblock_skip()?
 
Ok, here's a patch implementing my suggestions.
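
To make the behavioural difference concrete, here is a minimal userspace
sketch of the old and new checks. It is not kernel code: struct model_page,
raw_order(), skip_persistent_old() and skip_persistent_new() are hypothetical
names, and PAGEBLOCK_ORDER is assumed to be 9 purely for illustration. The
model assumes that the compound order reads as 0 on tail pages, as the
comment in isolate_migratepages_block() notes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed for illustration only: order-9 pageblocks (2 MB with 4 KB pages). */
#define PAGEBLOCK_ORDER 9u

/* Hypothetical, simplified stand-in for the kernel's struct page. */
struct model_page {
	struct model_page *head;	/* NULL if not compound; self for a head page */
	unsigned int order;		/* compound order, meaningful on the head only */
	bool hugetlb;			/* head belongs to a hugetlbfs page (assumption) */
};

/* Models reading the compound order directly from a page: 0 unless it is a head. */
static unsigned int raw_order(const struct model_page *page)
{
	return (page->head == page) ? page->order : 0;
}

/* Old check: hugetlbfs only, and the order must equal pageblock order exactly. */
static bool skip_persistent_old(const struct model_page *page)
{
	if (!page->head || !page->head->hugetlb)
		return false;
	return raw_order(page) == PAGEBLOCK_ORDER;
}

/* New check: any compound page, order looked up via the head, >= pageblock order. */
static bool skip_persistent_new(const struct model_page *page)
{
	if (!page->head)
		return false;
	return page->head->order >= PAGEBLOCK_ORDER;
}
```

In this model, a THP head page, a gigantic hugetlb head page and any of its
tail pages are rejected by the old check and accepted by the new one, while
an order-9 hugetlb head page is accepted by both.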

----8<----
From 83dfe045e0cd23f18a67c56b5bd848337610dfff Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@xxxxxxx>
Date: Fri, 1 Sep 2017 14:07:41 +0200
Subject: [PATCH] mm, compaction: extend pageblock_skip_persistent() to all
 compound pages

The pageblock_skip_persistent() function checks for HugeTLB pages of pageblock
order. When clearing pageblock skip bits for compaction, the bits are not
cleared for such pageblocks, because they cannot contain base pages suitable
for migration, nor free pages to use as migration targets.

This optimization can be simply extended to all compound pages of order equal
to or larger than pageblock order, because migrating such pages (if they
support it) cannot help sub-pageblock fragmentation. This includes THPs and
also gigantic HugeTLB pages, which the current implementation doesn't
persistently skip due to a strict pageblock_order equality check and not
recognizing tail pages.

Additionally, this patch removes the pageblock_skip_persistent() calls from
the migration and free scanners, since the generic compound page treatment
together with the update_pageblock_skip() call will also lead to pageblocks
starting with a large enough compound page being immediately marked for
skipping, which then becomes persistent.

Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
---
 mm/compaction.c | 43 +++++++++++++++++--------------------------
 1 file changed, 17 insertions(+), 26 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index f91d64cb94ac..11be786af74f 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -218,17 +218,21 @@ static void reset_cached_positions(struct zone *zone)
 }
 
 /*
- * Hugetlbfs pages should consistenly be skipped until updated by the hugetlb
- * subsystem.  It is always pointless to compact pages of pageblock_order and
- * the free scanner can reconsider when no longer huge.
+ * Compound pages of >= pageblock_order should consistently be skipped until
+ * released. It is always pointless to compact pages of such order (if they are
+ * migratable), and the pageblocks they occupy cannot contain any free pages.
  */
-static bool pageblock_skip_persistent(struct page *page, unsigned int order)
+static bool pageblock_skip_persistent(struct page *page)
 {
-	if (!PageHuge(page))
+	if (!PageCompound(page))
 		return false;
-	if (order != pageblock_order)
-		return false;
-	return true;
+
+	page = compound_head(page);
+
+	if (compound_order(page) >= pageblock_order)
+		return true;
+
+	return false;
 }
 
 /*
@@ -255,7 +259,7 @@ static void __reset_isolation_suitable(struct zone *zone)
 			continue;
 		if (zone != page_zone(page))
 			continue;
-		if (pageblock_skip_persistent(page, compound_order(page)))
+		if (pageblock_skip_persistent(page))
 			continue;
 
 		clear_pageblock_skip(page);
@@ -322,8 +326,7 @@ static inline bool isolation_suitable(struct compact_control *cc,
 	return true;
 }
 
-static inline bool pageblock_skip_persistent(struct page *page,
-					     unsigned int order)
+static inline bool pageblock_skip_persistent(struct page *page)
 {
 	return false;
 }
@@ -472,10 +475,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 		if (PageCompound(page)) {
 			const unsigned int order = compound_order(page);
 
-			if (pageblock_skip_persistent(page, order)) {
-				set_pageblock_skip(page);
-				blockpfn = end_pfn;
-			} else if (likely(order < MAX_ORDER)) {
+			if (likely(order < MAX_ORDER)) {
 				blockpfn += (1UL << order) - 1;
 				cursor += (1UL << order) - 1;
 			}
@@ -797,10 +797,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		if (PageCompound(page)) {
 			const unsigned int order = compound_order(page);
 
-			if (pageblock_skip_persistent(page, order)) {
-				set_pageblock_skip(page);
-				low_pfn = end_pfn;
-			} else if (likely(order < MAX_ORDER))
+			if (likely(order < MAX_ORDER))
 				low_pfn += (1UL << order) - 1;
 			goto isolate_fail;
 		}
@@ -863,13 +860,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			 * is safe to read and it's 0 for tail pages.
 			 */
 			if (unlikely(PageCompound(page))) {
-				const unsigned int order = compound_order(page);
-
-				if (pageblock_skip_persistent(page, order)) {
-					set_pageblock_skip(page);
-					low_pfn = end_pfn;
-				} else
-					low_pfn += (1UL << order) - 1;
+				low_pfn += (1UL << compound_order(page)) - 1;
 				goto isolate_fail;
 			}
 		}
-- 
2.14.1
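
For readers following along, the way persistency falls out of rechecking in
the reset can be sketched in userspace as below. This is a simplified model
under stated assumptions, not the kernel implementation: struct
model_pageblock and model_reset_skip_bits() are hypothetical names, and the
large_compound flag stands in for pageblock_skip_persistent() returning true
for the first page of the pageblock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pageblock state for illustration. */
struct model_pageblock {
	bool skip;		/* models the pageblock skip bit */
	bool large_compound;	/* first page is part of a compound page of
				 * >= pageblock order (assumed input) */
};

/*
 * Models the reset loop: clear every skip bit except for pageblocks that are
 * still occupied by a large compound page. Returns how many set bits were
 * actually cleared.
 */
static size_t model_reset_skip_bits(struct model_pageblock *blocks, size_t nr)
{
	size_t cleared = 0;

	for (size_t i = 0; i < nr; i++) {
		if (blocks[i].large_compound)
			continue;	/* skip bit stays as it is: persistent */
		if (blocks[i].skip)
			cleared++;
		blocks[i].skip = false;
	}
	return cleared;
}
```

Once the compound page is released (large_compound becomes false in the
model), the next reset clears the bit, so no notification from the allocating
subsystem is needed.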
