On Mon, 25 Mar 2019 at 15:58, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Ok, it's somewhat of a pity that we don't know what PFN that page
> corresponds to. Specifically it would be interesting to know if the PFN
> corresponds to a memory hole as DMA32 on your machine has a number of
> gaps. What I'm wondering is if the reinit fails to find good starting
> points that it picks a PFN that corresponds to an uninitialised page and
> trips up later.
>
> Can you try again with this patch please? It replaces the failed patch
> entirely.
>
> Thanks.
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index f171a83707ce..caac4b07eb33 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -242,6 +242,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  					bool check_target)
>  {
>  	struct page *page = pfn_to_online_page(pfn);
> +	struct page *block_page;
>  	struct page *end_page;
>  	unsigned long block_pfn;
>
> @@ -267,20 +268,26 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  			get_pageblock_migratetype(page) != MIGRATE_MOVABLE)
>  		return false;
>
> +	/* Ensure the start of the pageblock or zone is online and valid */
> +	block_pfn = pageblock_start_pfn(pfn);
> +	block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
> +	if (block_page) {
> +		page = block_page;
> +		pfn = block_pfn;
> +	}
> +
> +	/* Ensure the end of the pageblock or zone is online and valid */
> +	block_pfn += pageblock_nr_pages;
> +	block_pfn = min(block_pfn, zone_end_pfn(zone));
> +	end_page = pfn_to_online_page(block_pfn);
> +	if (!end_page)
> +		return false;
> +
>  	/*
>  	 * Only clear the hint if a sample indicates there is either a
>  	 * free page or an LRU page in the block. One or other condition
>  	 * is necessary for the block to be a migration source/target.
>  	 */
> -	block_pfn = pageblock_start_pfn(pfn);
> -	pfn = max(block_pfn, zone->zone_start_pfn);
> -	page = pfn_to_page(pfn);
> -	if (zone != page_zone(page))
> -		return false;
> -	pfn = block_pfn + pageblock_nr_pages;
> -	pfn = min(pfn, zone_end_pfn(zone));
> -	end_page = pfn_to_page(pfn);
> -
>  	do {
>  		if (pfn_valid_within(pfn)) {
>  			if (check_source && PageLRU(page)) {
> @@ -320,6 +327,16 @@ static void __reset_isolation_suitable(struct zone *zone)
>
>  	zone->compact_blockskip_flush = false;
>
> +
> +	/*
> +	 * Re-init the scanners and attempt to find a better starting
> +	 * position below. This may result in redundant scanning if
> +	 * a better position is not found but it avoids the corner
> +	 * case whereby the cached PFNs are left in a memory hole with
> +	 * no proper struct page backing it.
> +	 */
> +	reset_cached_positions(zone);
> +
>  	/*
>  	 * Walk the zone and update pageblock skip information. Source looks
>  	 * for PageLRU while target looks for PageBuddy. When the scanner
> @@ -349,13 +366,6 @@ static void __reset_isolation_suitable(struct zone *zone)
>  			zone->compact_cached_free_pfn = reset_free;
>  		}
>  	}
> -
> -	/* Leave no distance if no suitable block was reset */
> -	if (reset_migrate >= reset_free) {
> -		zone->compact_cached_migrate_pfn[0] = migrate_pfn;
> -		zone->compact_cached_migrate_pfn[1] = migrate_pfn;
> -		zone->compact_cached_free_pfn = free_pfn;
> -	}
>  }
>
>  void reset_isolation_suitable(pg_data_t *pgdat)
>
> --
> Mel Gorman
> SUSE Labs

Kernel panics still occur.

--
Best Regards,
Mike Gavrilov.
Attachment: system-log12.tar.xz (application/xz)
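
A minimal userspace sketch of the boundary handling the patch adds, for readers not following the kernel sources. This is not kernel code: the zone model, the hole layout and every name in it (sim_zone, sim_pfn_online, block_restart_point) are hypothetical stand-ins for pfn_to_online_page(), pageblock_start_pfn() and zone_end_pfn(). The point it illustrates is that a candidate restart PFN is only accepted once both the clamped start of its pageblock and the last in-zone PFN of that block are known to be backed by online memory, so a cached scanner position can never be left inside a memory hole.

#include <stdbool.h>
#include <stdio.h>

#define SIM_PAGEBLOCK_NR_PAGES 512UL	/* 2MB blocks with 4K pages */

struct sim_zone {
	unsigned long start_pfn;
	unsigned long end_pfn;			/* exclusive */
	const unsigned long (*holes)[2];	/* [start, end) ranges with no memmap */
	int nr_holes;
};

/* Rough stand-in for pfn_to_online_page(); a NULL return maps to false here. */
static bool sim_pfn_online(const struct sim_zone *z, unsigned long pfn)
{
	if (pfn < z->start_pfn || pfn >= z->end_pfn)
		return false;
	for (int i = 0; i < z->nr_holes; i++)
		if (pfn >= z->holes[i][0] && pfn < z->holes[i][1])
			return false;
	return true;
}

static unsigned long sim_block_start(unsigned long pfn)
{
	return pfn & ~(SIM_PAGEBLOCK_NR_PAGES - 1);
}

/*
 * Simplified analogue of the patched boundary checks: accept a restart
 * point only if the clamped start of the pageblock and the last PFN of
 * the clamped block are both online.
 */
static bool block_restart_point(const struct sim_zone *z, unsigned long pfn,
				unsigned long *restart)
{
	unsigned long block_pfn, start, end;

	if (!sim_pfn_online(z, pfn))
		return false;

	/* Ensure the start of the pageblock or zone is online */
	block_pfn = sim_block_start(pfn);
	start = block_pfn > z->start_pfn ? block_pfn : z->start_pfn;
	if (!sim_pfn_online(z, start))
		return false;

	/* Ensure the end of the pageblock or zone is online */
	end = block_pfn + SIM_PAGEBLOCK_NR_PAGES;
	if (end > z->end_pfn)
		end = z->end_pfn;
	if (!sim_pfn_online(z, end - 1))
		return false;

	*restart = start;
	return true;
}

int main(void)
{
	/* A DMA32-like zone with a couple of gaps; the layout is made up. */
	static const unsigned long holes[][2] = {
		{ 0x9f000, 0xa0000 },
		{ 0xbf700, 0xc0000 },
	};
	struct sim_zone z = {
		.start_pfn = 0x1000, .end_pfn = 0x100000,
		.holes = holes, .nr_holes = 2,
	};
	unsigned long samples[] = { 0x9f123, 0xbf6ff, 0x40000 };

	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		unsigned long restart;

		if (block_restart_point(&z, samples[i], &restart))
			printf("pfn 0x%lx -> restart at 0x%lx\n",
			       samples[i], restart);
		else
			printf("pfn 0x%lx rejected (hole or boundary)\n",
			       samples[i]);
	}
	return 0;
}

With the made-up hole layout above, the first sample sits inside a hole and the second sits in a block whose tail falls into a hole, so both are rejected; only the third yields a restart point at its block start.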