On Tue, 26 Mar 2019 at 01:31, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
>
> Ok, thanks.
>
> Trying one last time before putting together a debugging patch to see
> exactly what PFNs are triggering as I still have not reproduced this on a
> local machine. This is another replacement that is based on the assumption
> that it's the free_pfn at the end of the zone that is triggering the
> warning and it happens to be the case the end of a zone is aligned. Sorry
> for the frustration with this and for persisting.
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index f171a83707ce..b4930bf93c8a 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -242,6 +242,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  							bool check_target)
>  {
>  	struct page *page = pfn_to_online_page(pfn);
> +	struct page *block_page;
>  	struct page *end_page;
>  	unsigned long block_pfn;
>
> @@ -267,20 +268,26 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  	    get_pageblock_migratetype(page) != MIGRATE_MOVABLE)
>  		return false;
>
> +	/* Ensure the start of the pageblock or zone is online and valid */
> +	block_pfn = pageblock_start_pfn(pfn);
> +	block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
> +	if (block_page) {
> +		page = block_page;
> +		pfn = block_pfn;
> +	}
> +
> +	/* Ensure the end of the pageblock or zone is online and valid */
> +	block_pfn += pageblock_nr_pages;
> +	block_pfn = min(block_pfn, zone_end_pfn(zone) - 1);
> +	end_page = pfn_to_online_page(block_pfn);
> +	if (!end_page)
> +		return false;
> +
>  	/*
>  	 * Only clear the hint if a sample indicates there is either a
>  	 * free page or an LRU page in the block. One or other condition
>  	 * is necessary for the block to be a migration source/target.
>  	 */
> -	block_pfn = pageblock_start_pfn(pfn);
> -	pfn = max(block_pfn, zone->zone_start_pfn);
> -	page = pfn_to_page(pfn);
> -	if (zone != page_zone(page))
> -		return false;
> -	pfn = block_pfn + pageblock_nr_pages;
> -	pfn = min(pfn, zone_end_pfn(zone));
> -	end_page = pfn_to_page(pfn);
> -
>  	do {
>  		if (pfn_valid_within(pfn)) {
>  			if (check_source && PageLRU(page)) {
> @@ -309,7 +316,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  static void __reset_isolation_suitable(struct zone *zone)
>  {
>  	unsigned long migrate_pfn = zone->zone_start_pfn;
> -	unsigned long free_pfn = zone_end_pfn(zone);
> +	unsigned long free_pfn = zone_end_pfn(zone) - 1;
>  	unsigned long reset_migrate = free_pfn;
>  	unsigned long reset_free = migrate_pfn;
>  	bool source_set = false;
>
>
>
> --
> Mel Gorman
> SUSE Labs

I do not want to hurry, but it looks like this patch has fixed the problem.
I will watch for a day. But the system has already experienced a night
without a hang (kernel panic).

--
Best Regards,
Mike Gavrilov.
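
For readers following the boundary arithmetic in the quoted patch, here is a minimal userspace sketch of it. It is not kernel code: `struct zone_span`, the helper names, and the pageblock size of 512 pages are assumptions made for illustration only (the real pageblock size depends on the kernel configuration). It only models how the end PFN of the sampled range is clamped before and after the patch, based on the assumption stated in the quoted mail that the zone end is pageblock-aligned.

```c
#include <stdbool.h>

/* Assumption: 512 pages per pageblock; the real value is config dependent. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical stand-in for the two zone boundaries used by the patch. */
struct zone_span {
	unsigned long start_pfn;	/* first PFN of the zone */
	unsigned long end_pfn;		/* one past the last PFN, like zone_end_pfn() */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Round a PFN down to its pageblock boundary, like pageblock_start_pfn(). */
static unsigned long block_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* First PFN looked up: the pageblock start, clamped to the zone start. */
static unsigned long sample_first(const struct zone_span *z, unsigned long pfn)
{
	return max_ul(block_start(pfn), z->start_pfn);
}

/* End PFN as computed by the removed lines: can equal end_pfn itself. */
static unsigned long sample_end_old(const struct zone_span *z, unsigned long pfn)
{
	return min_ul(block_start(pfn) + PAGEBLOCK_NR_PAGES, z->end_pfn);
}

/* End PFN as computed by the added lines: never beyond the last zone PFN. */
static unsigned long sample_end_new(const struct zone_span *z, unsigned long pfn)
{
	return min_ul(block_start(pfn) + PAGEBLOCK_NR_PAGES, z->end_pfn - 1);
}

/* True if pfn lies inside the zone's span. */
static bool in_zone(const struct zone_span *z, unsigned long pfn)
{
	return pfn >= z->start_pfn && pfn < z->end_pfn;
}
```

With a zone spanning PFNs 4096 through 8191 (so `end_pfn` is 8192 and pageblock-aligned), `sample_end_old()` for the last pageblock yields 8192, which is outside the zone, while `sample_end_new()` yields 8191. Likewise, starting from `end_pfn` itself gives a pageblock start of 8192, whereas starting from `end_pfn - 1` gives 7680, which is inside the zone.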