On Wed, Nov 25, 2020 at 07:45:30AM +0100, David Hildenbrand wrote:
> > Something must have changed more recently than v5.1 that caused the
> > zoneid of reserved pages to be wrong, a possible candidate for the
> > real would be this change below:
> >
> > +               __init_single_page(pfn_to_page(pfn), pfn, 0, 0);
> >
>
> Before that change, the memmap of memory holes were only zeroed out.
> So the zones/nid was 0, however, pages were not reserved and had a
> refcount of zero - resulting in other issues.
>
> Most pfn walkers shouldn't mess with reserved pages and simply skip
> them. That would be the right fix here.
>

Ordinarily yes, pfn walkers should not care about reserved pages but
it's still surprising that the node/zone linkages would be wrong for
memory holes. If they are in the middle of a zone, it means that a
hole with valid struct pages could be mistaken for overlapping nodes
(if the hole was in node 1 for example) or overlapping zones which is
just broken.

> >
> > Whenever pfn_valid is true, it's better that the zoneid/nid is correct
> > all times, otherwise if the second stage fails we end up in a bug with
> > weird side effects.
>
> Memory holes with a valid memmap might not have a zone/nid. For now,
> skipping reserved pages should be good enough, no?
>

It would partially paper over the issue of setting the pageblock type
based on a reserved page. I agree that compaction should not be
returning pfns that are outside of the zone range because that is
buggy in itself but valid struct pages should have valid information.
I don't think we want to paper over that with unnecessary PageReserved
checks.

-- 
Mel Gorman
SUSE Labs