On 06/29/2016 11:47 PM, David Rientjes wrote:
It's possible to isolate some freepages in a pageblock and then fail split_free_page() due to the low watermark check. In this case, we hit VM_BUG_ON() because the freeing scanner terminated early without a contended lock or enough freepages. This should never have been a VM_BUG_ON() since it's not a fatal condition. It should have been a VM_WARN_ON() at best, or even handled gracefully. Regardless, we need to terminate anytime the full pageblock scan was not done. The logic belongs in isolate_freepages_block(), so handle its state gracefully by terminating the pageblock loop and making a note to restart at the same pageblock next time since it was not possible to complete the scan this time. Reported-by: Minchan Kim <minchan@xxxxxxxxxx> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx> --- Note: I really dislike the low watermark check in split_free_page() and consider it poor software engineering. The function should split a free page, nothing more. Terminating memory compaction because of a low watermark check when we're simply trying to migrate memory seems like an arbitrary heuristic. There was an objection to removing it in the first proposed patch, but I think we should really consider removing that check so this is simpler.
There's a patch changing it to min watermark (you were CC'd on the series). We could argue whether it belongs to split_free_page() or some wrapper of it, but I don't think removing it completely should be done. If zone is struggling with order-0 pages, a functionality for making higher-order pages shouldn't make it even worse. It's also not that arbitrary, even if we succeeded the migration and created a high-order page, the higher-order allocation would still fail due to watermark checks. Worse, __compact_finished() would keep telling the compaction to continue, creating an even longer lag, which is also against your recent patches.
mm/compaction.c | 37 +++++++++++++++---------------------- 1 file changed, 15 insertions(+), 22 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1009,8 +1009,6 @@ static void isolate_freepages(struct compact_control *cc) block_end_pfn = block_start_pfn, block_start_pfn -= pageblock_nr_pages, isolate_start_pfn = block_start_pfn) { - unsigned long isolated; - /* * This can iterate a massively long zone without finding any * suitable migration targets, so periodically check if we need @@ -1034,36 +1032,31 @@ static void isolate_freepages(struct compact_control *cc) continue; /* Found a block suitable for isolating free pages from. */ - isolated = isolate_freepages_block(cc, &isolate_start_pfn, - block_end_pfn, freelist, false); - /* If isolation failed early, do not continue needlessly */ - if (!isolated && isolate_start_pfn < block_end_pfn && - cc->nr_migratepages > cc->nr_freepages) - break; + isolate_freepages_block(cc, &isolate_start_pfn, block_end_pfn, + freelist, false); /* - * If we isolated enough freepages, or aborted due to async - * compaction being contended, terminate the loop. - * Remember where the free scanner should restart next time, - * which is where isolate_freepages_block() left off. - * But if it scanned the whole pageblock, isolate_start_pfn - * now points at block_end_pfn, which is the start of the next - * pageblock. - * In that case we will however want to restart at the start - * of the previous pageblock. + * If we isolated enough freepages, or aborted due to lock + * contention, terminate. */ if ((cc->nr_freepages >= cc->nr_migratepages) || cc->contended) { - if (isolate_start_pfn >= block_end_pfn) + if (isolate_start_pfn >= block_end_pfn) { + /* + * Restart at previous pageblock if more + * freepages can be isolated next time. + */
That's not as explanatory as before :/ oh well...
isolate_start_pfn = block_start_pfn - pageblock_nr_pages; + } break; - } else { + } else if (isolate_start_pfn < block_end_pfn) { /* - * isolate_freepages_block() should not terminate - * prematurely unless contended, or isolated enough + * If isolation failed early, do not continue + * needlessly. */ - VM_BUG_ON(isolate_start_pfn < block_end_pfn); + isolate_start_pfn = block_start_pfn;
Note that this reset shouldn't be in fact necessary - without it, next attempt would restart exactly at the pfn that we failed to split due to watermark checks. But not a big deal.
+ break; } }
-- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html