Hi,

On Sat, Dec 05, 2020 at 04:54:01PM -0800, akpm@xxxxxxxxxxxxxxxxxxxx wrote:
> 
> The patch titled
>      Subject: mm: initialize struct pages in reserved regions outside of the zone ranges
> has been added to the -mm tree.  Its filename is
>      mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 
> This patch should soon appear at
>      https://ozlabs.org/~akpm/mmots/broken-out/mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> and later at
>      https://ozlabs.org/~akpm/mmotm/broken-out/mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> ------------------------------------------------------
> From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Subject: mm: initialize struct pages in reserved regions outside of the zone ranges
> 
> Without this change, pfn 0 isn't in any zone spanned range, and it's also
> not in any memblock.memory range, so the struct page of pfn 0 wasn't
> initialized and PagePoison remained set when reserve_bootmem_region
> called __SetPageReserved, inducing a silent boot failure with DEBUG_VM
> (and correctly so, because the crash signaled that the nodeid/nid of
> pfn 0 would again be wrong).
> 
> There's no enforcement that all memblock.reserved ranges must overlap
> memblock.memory ranges, so the memblock.reserved ranges also require an
> explicit initialization, and the zone ranges need to be extended to
> include all memblock.reserved ranges with struct pages too, or they'll
> be left uninitialized with PagePoison, as happened to pfn 0.
> 
> Link: https://lkml.kernel.org/r/20201205013238.21663-2-aarcange@xxxxxxxxxx
> Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
> Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> Cc: Baoquan He <bhe@xxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Qian Cai <cai@xxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
> 
>  include/linux/memblock.h |   17 ++++++++---
>  mm/debug.c               |    3 +
>  mm/memblock.c            |    4 +-
>  mm/page_alloc.c          |   57 +++++++++++++++++++++++++++++--------
>  4 files changed, 63 insertions(+), 18 deletions(-)

I don't see why we need all this complexity when a simple fixup was
enough.
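For what it's worth, the ranges the changelog is talking about (entries in
memblock.reserved that memblock.memory doesn't cover, like pfn 0 here) can
be spotted with a check along these lines.  This is only an illustrative
sketch, not part of the patch, and the function name is made up, but
memblock_is_region_memory() and the memblock.reserved fields it walks are
existing memblock interfaces:

#include <linux/init.h>
#include <linux/memblock.h>
#include <linux/printk.h>

/* Sketch only: report reserved ranges not fully covered by memblock.memory. */
static void __init report_reserved_outside_memory(void)
{
	unsigned long i;

	for (i = 0; i < memblock.reserved.cnt; i++) {
		struct memblock_region *r = &memblock.reserved.regions[i];
		phys_addr_t end = r->base + r->size;

		if (!memblock_is_region_memory(r->base, r->size))
			pr_info("reserved %pa-%pa not covered by memblock.memory\n",
				&r->base, &end);
	}
}

Any range such a check reports is exactly the kind of range whose struct
pages stay PagePoison'ed unless they are initialized explicitly, which is
what the hunks below do.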
> --- a/include/linux/memblock.h~mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges
> +++ a/include/linux/memblock.h

...

> --- a/mm/page_alloc.c~mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges
> +++ a/mm/page_alloc.c

...

> @@ -6227,7 +6233,7 @@ void __init __weak memmap_init(unsigned
>  			       unsigned long zone,
>  			       unsigned long range_start_pfn)
>  {
> -	unsigned long start_pfn, end_pfn, next_pfn = 0;
> +	unsigned long start_pfn, end_pfn, prev_pfn = 0;
>  	unsigned long range_end_pfn = range_start_pfn + size;
>  	u64 pgcnt = 0;
>  	int i;
> @@ -6235,7 +6241,7 @@ void __init __weak memmap_init(unsigned
>  	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
>  		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
>  		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
> -		next_pfn = clamp(next_pfn, range_start_pfn, range_end_pfn);
> +		prev_pfn = clamp(prev_pfn, range_start_pfn, range_end_pfn);
> 
>  		if (end_pfn > start_pfn) {
>  			size = end_pfn - start_pfn;
> @@ -6243,10 +6249,10 @@ void __init __weak memmap_init(unsigned
>  				 MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
>  		}
> 
> -		if (next_pfn < start_pfn)
> -			pgcnt += init_unavailable_range(next_pfn, start_pfn,
> +		if (prev_pfn < start_pfn)
> +			pgcnt += init_unavailable_range(prev_pfn, start_pfn,
>  							zone, nid);
> -		next_pfn = end_pfn;
> +		prev_pfn = end_pfn;
>  	}
> 
>  	/*
> @@ -6256,12 +6262,31 @@ void __init __weak memmap_init(unsigned
>  	 * considered initialized. Make sure that memmap has a well defined
>  	 * state.
>  	 */
> -	if (next_pfn < range_end_pfn)
> -		pgcnt += init_unavailable_range(next_pfn, range_end_pfn,
> +	if (prev_pfn < range_end_pfn)
> +		pgcnt += init_unavailable_range(prev_pfn, range_end_pfn,
>  						zone, nid);
> 
> +	/*
> +	 * memblock.reserved isn't enforced to overlap with
> +	 * memblock.memory so initialize the struct pages for
> +	 * memblock.reserved too in case it wasn't overlapping.
> +	 *
> +	 * If any struct page associated with a memblock.reserved
> +	 * range isn't overlapping with a zone range, it'll be left
> +	 * uninitialized, ideally with PagePoison, and it'll be a more
> +	 * easily detectable error.
> +	 */
> +	for_each_res_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
> +		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
> +		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
> +
> +		if (end_pfn > start_pfn)
> +			pgcnt += init_unavailable_range(start_pfn, end_pfn,
> +							zone, nid);
> +	}

This means we are going to iterate over all memory allocated from memblock
before free_area_init() one extra time: once here and once more in
reserve_bootmem_region().  And this can be substantial for CMA and
alloc_large_system_hash().

> +
>  	if (pgcnt)
> -		pr_info("%s: Zeroed struct page in unavailable ranges: %lld\n",
> +		pr_info("%s: pages in unavailable ranges: %lld\n",
>  			zone_names[zone], pgcnt);
>  }
> 
> @@ -6499,6 +6524,10 @@ void __init get_pfn_range_for_nid(unsign
>  		*start_pfn = min(*start_pfn, this_start_pfn);
>  		*end_pfn = max(*end_pfn, this_end_pfn);
>  	}
> +	for_each_res_pfn_range(i, nid, &this_start_pfn, &this_end_pfn, NULL) {
> +		*start_pfn = min(*start_pfn, this_start_pfn);
> +		*end_pfn = max(*end_pfn, this_end_pfn);
> +	}
> 
>  	if (*start_pfn == -1UL)
>  		*start_pfn = 0;
> @@ -7126,7 +7155,13 @@ unsigned long __init node_map_pfn_alignm
>   */
>  unsigned long __init find_min_pfn_with_active_regions(void)
>  {
> -	return PHYS_PFN(memblock_start_of_DRAM());
> +	/*
> +	 * reserved regions must be included so that their page
> +	 * structure can be part of a zone and obtain a valid zoneid
> +	 * before __SetPageReserved().
> +	 */
> +	return min(PHYS_PFN(memblock_start_of_DRAM()),
> +		   PHYS_PFN(memblock.reserved.regions[0].base));

So this implies that reserved memory starts before memory.  Don't you find
this weird?
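To make that concrete (with made-up numbers, not ones from the actual bug
report): with 4K pages and a layout like

	memblock.memory.regions[0].base   == 0x1000
	memblock.reserved.regions[0].base == 0x0

the old code returns PHYS_PFN(0x1000) == 1 while the new code returns
min(PHYS_PFN(0x1000), PHYS_PFN(0x0)) == 0, i.e. the node span is only
widened when the first memblock.reserved region really does start below
the first memblock.memory region, which is the pfn 0 case the changelog
describes.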
>  }
> 
>  /*
> _
> 
> Patches currently in -mm which might be from aarcange@xxxxxxxxxx are
> 
> mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch

-- 
Sincerely yours,
Mike.