On Thu, Nov 26, 2020 at 11:05:14AM +0100, David Hildenbrand wrote: > I agree that this is sub-optimal, as such pages are impossible to detect > (PageReserved is just not clear as discussed with Andrea). The basic > question is how we want to proceed: > > a) Make sure any online struct page has a valid nid/zid, and is spanned > by the nid/zid. > b) Use a fake nid that will bail out when used for page_zone() and > page_pgdat(), and make pfn walkers detect that. > > AFAIU, Andrea seems to prefer a). I thing b) might be easier in corner > cases. Thoughts? Yes that's a good summary. My reason I slightly prefer a) is that it will perform faster at runtime. I seen also the proposed patch that adds the pfn_to_page embedded in pfn_valid to show those pfn as invalid, that is especially slow and I don't like it for that reason. Additional bugchecks for NO_NODEID will slowdown things too, so they should be justified by some benefit. It concerns me if pfn_valid becomes slower. b) will also make compaction bail out on that pageblock which it doesn't need to so it'll provide a worse runtime to compaction as well. a) is what the code already does if only the e820 map range was reserved with memblock_reserved, after applying Mike's patch to initialize reserved memblock regions too. It turns out the VM_BUG_ON_PAGE, as far as compaction is concerned, is just a false positive, it detected a broken VM invariant, but it was fine to proceed in the DEBUG_VM=n case that wouldn't crash. Before da50c57bdbb8e37ec6f8c934a2f8acbf4e4fdfb9 the struct page corresponding to the e820 unknown type range page wouldn't be marked PageReserved, that also looked wrong but it was also basically harmless. Neither of the two defects is too bad: it could be ignored if we just remove the bugcheck, but it's just nice if the bugcheck turn out to be correct in the pageblock. If we initialize all RAM and non-RAM ranges in the same way, then there's no discrepancy. Mike's patch seem to do just that by walking the memblock.reserved memblocks in the same way as the memblock.memory. NOTE: I would also much prefer b) if only it would guarantee zero runtime slowdown. (Which is I especially dislike pfn_valid internally calling pfn_to_page) Thanks, Andrea