Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock

On Mon, Jan 09, 2023 at 08:43:40PM +0100, Vlastimil Babka wrote:
> On 7/19/22 10:28, Mel Gorman wrote:
> > On Fri, Jul 15, 2022 at 11:26:01PM +0800, Chuyi Zhou wrote:
> >> You are right, we do need some mechanism to ensure a !IS_ALIGNED block
> >> from fast_find is marked skipped after it has been scanned. However, I
> >> think the following code may not work, because *skip_updated* has been reset:
> >>     if (!skip_updated) {
> >>         skip_updated = true;
> >>         if (test_and_set_skip(cc, page, low_pfn))
> >>             goto isolate_abort;
> >>     }
> >> Why not ignore skip_updated after scanning a block, like this:
> >> 
> >> diff --git a/mm/compaction.c b/mm/compaction.c
> >> index 962d05d1e187..1c388c45f127 100644
> >> --- a/mm/compaction.c
> >> +++ b/mm/compaction.c
> >> @@ -1180,7 +1180,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >>          * rescanned twice in a row.
> >>          */
> >>         if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
> >> -               if (valid_page && !skip_updated)
> >> +               if (valid_page)
> >>                         set_pageblock_skip(valid_page);
> >>                 update_cached_migrate(cc, low_pfn);
> >>         }
> >> 
> > 
> > Because valid_page still needs to be set, but ensuring valid_page is set
> > is sufficient.
> > 
> > The pageblock will be marked for skip on the first locking of a lruvec
> > 
> >                         /* Try get exclusive access under lock */
> >                         if (!skip_updated) {
> >                                 skip_updated = true;
> >                                 if (test_and_set_skip(cc, page, low_pfn))
> >                                         goto isolate_abort;
> >                         }
> > 
> > That will happen regardless of alignment.
> 
> I'm not sure about that, as test_and_set_skip() has "if
> (!pageblock_aligned(pfn)) return false". So I think it only takes the first
> pfn in the pageblock not being PageLRU, so that we don't try to get
> exclusive access for it; the following pfns will not be aligned, and thus
> we never set_pageblock_skip() through this path. If we have
> nr_isolated > 0 and not cc->rescan, we might not set_pageblock_skip()
> through the hunk above either.
> 

True.
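
For anyone following along, the check in question looks roughly like the
following in the 6.1-era mm/compaction.c (paraphrased, so treat it as a
sketch rather than a verbatim quote):

static bool test_and_set_skip(struct compact_control *cc, struct page *page,
			      unsigned long pfn)
{
	bool skip;

	/* Do not update skip hints if they are being ignored */
	if (cc->ignore_skip_hint)
		return false;

	/*
	 * Only a pageblock-aligned pfn can set the skip bit here. If
	 * fast_find_migrateblock() returned a mid-block pfn, every pfn
	 * scanned in that block is unaligned and the block is never
	 * marked for skipping through this path.
	 */
	if (!pageblock_aligned(pfn))
		return false;

	skip = get_pageblock_skip(page);
	if (!skip && !cc->no_set_skip_hint)
		set_pageblock_skip(page);

	return skip;
}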

> I've been checking this area because we have some reports [1] of upstream
> 6.1 causing long loops in khugepaged, pegging 100% CPU in compaction.
> Tracepoint data suggests we keep (successfully) isolating migratepages over
> and over through fast_find_migrateblock() (the pfn ranges are not linearly
> increasing, but there's probably a cycle as we rotate the freelist), but are
> failing to migrate them (for some reason). Between 6.0 and 6.1 this patch
> went in as commit 7efc3b726103, so I strongly suspect it's the root cause
> and will be providing a kernel with the revert in the bug to confirm.
> Consider this an early heads up :)
> 
> [1] https://bugzilla.suse.com/show_bug.cgi?id=1206848

The trace shows that there is a lot of rescanning between a small subset
of pageblocks because the conditions for marking the block skip are not
met. The scan is not reaching the end of the pageblock because enough
pages were isolated, but none were migrated successfully. Eventually it
circles back to the same block.

Pageblock skipping is tricky because we are trying to minimise both
latency and excessive scanning. However, tracking exactly when a block
is fully scanned requires an excessive amount of state.

One potential mitigation is to forcibly rescan a pageblock when migrations
fail, even though the failures could be for transient reasons such as page
writeback or the page being dirty. This will sometimes migrate too many
pages (worst case 500) but pageblocks will be marked for skipping.

This is not tested at all:

diff --git a/mm/compaction.c b/mm/compaction.c
index ca1603524bbe..9242fcaeb09c 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2387,6 +2387,7 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 			cc->rescan = true;
 		}
 
+retry_rescan:
 		switch (isolate_migratepages(cc)) {
 		case ISOLATE_ABORT:
 			ret = COMPACT_CONTENDED;
@@ -2429,15 +2430,28 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 				goto out;
 			}
 			/*
-			 * We failed to migrate at least one page in the current
-			 * order-aligned block, so skip the rest of it.
+			 * If an ASYNC or SYNC_LIGHT compaction fails to migrate a
+			 * page within the current order-aligned block, rescan the
+			 * entire block. Once the whole block is scanned, it'll
+			 * be marked "skip" to avoid rescanning in the near
+			 * future. This will isolate more pages than necessary
+			 * for the request but avoid loops due to
+			 * fast_find_migrateblock revisiting blocks that were
+			 * recently partially scanned.
 			 */
-			if (cc->direct_compaction &&
-						(cc->mode == MIGRATE_ASYNC)) {
-				cc->migrate_pfn = block_end_pfn(
-						cc->migrate_pfn - 1, cc->order);
-				/* Draining pcplists is useless in this case */
-				last_migrated_pfn = 0;
+			if (cc->direct_compaction && !cc->rescan &&
+						(cc->mode < MIGRATE_SYNC)) {
+				cc->rescan = true;
+
+				/*
+				 * Draining pcplists does not help THP if
+				 * any page failed to migrate. Even after
+				 * drain, the pageblock will not be free.
+				 */
+				if (cc->order == COMPACTION_HPAGE_ORDER)
+					last_migrated_pfn = 0;
+
+				goto retry_rescan;
 			}
 		}
 

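For completeness, the reason forcing a rescan helps is that
isolate_migratepages_block() treats cc->rescan specially in two places.
Paraphrasing the 6.1-era code (again a sketch, not a verbatim quote), the
isolation limit is ignored on a rescan

		/* Keep isolating past COMPACT_CLUSTER_MAX when rescanning */
		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
		    !cc->rescan && !cc->contended) {
			++low_pfn;
			break;
		}

and the end-of-block bookkeeping marks the block for skipping even when
pages were isolated

	/* Fully scanned block on a rescan is marked for skipping */
	if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
		if (valid_page && !skip_updated)
			set_pageblock_skip(valid_page);
		update_cached_migrate(cc, low_pfn);
	}

so with rescan set the whole pageblock is scanned to end_pfn and reliably
marked skip, instead of being left partially scanned for
fast_find_migrateblock() to find again.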


