On 8/13/19 7:22 PM, David Rientjes wrote:
> On Tue, 13 Aug 2019, Vlastimil Babka wrote:
>
>> > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"),
>> > struct page of reserved memory is zeroed. This causes page->flags to be 0
>> > and fixes issues related to reading /proc/kpageflags, for example, of
>> > reserved memory.
>> >
>> > The VM_BUG_ON() in move_freepages_block(), however, assumes that
>> > page_zone() is meaningful even for reserved memory. That assumption is no
>> > longer true after the aforementioned commit.
>>
>> How comes that move_freepages_block() gets called on reserved memory in
>> the first place?
>>
>
> It's simply math after finding a valid free page from the per-zone free
> area to use as fallback. We find the beginning and end of the pageblock
> of the valid page and that can bring us into memory that was reserved per
> the e820. pfn_valid() is still true (it's backed by a struct page), but
> since it's zero'd we shouldn't make any inferences here about comparing
> its node or zone. The current node check just happens to succeed most of
> the time by luck because reserved memory typically appears on node 0.
>
> The fix here is to validate that we actually have buddy pages before
> testing if there's any type of zone or node strangeness going on.

I see, thanks.

>> > @@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *zone,
>> >  			continue;
>> >  		}
>> >
>> > +		/* Make sure we are not inadvertently changing nodes */
>> > +		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
>> > +		VM_BUG_ON_PAGE(page_zone(page) != zone, page);
>>
>> The later check implies the former check, so if it's to stay, the first
>> one could be removed and comment adjusted s/nodes/zones/
>>
>
> Does it? The first is checking for a corrupted page_to_nid the second is
> checking for a corrupted or unexpected page_zone. What's being tested
> here is the state of struct page, as it was previous to this patch, not
> the state of struct zone.

page_zone() calls page_to_nid() internally, so if nid was wrong, the
resulting zone pointer would be also wrong. But if you want more fine
grained bug output, that's fine.
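
To make the "implies" argument concrete, here is a small userspace model
(a sketch only, not the kernel's code). The flags layout, the MODEL_*
constants, node_data[], init_model() and make_page() are all made up for
illustration; only the shape of page_zone(), which looks up the node via
page_to_nid() and then indexes that node's zones, is meant to mirror what
the kernel does.

```c
#include <stddef.h>

/* Hypothetical model sizes and flags layout, not the kernel's. */
#define MODEL_NR_NODES  2
#define MODEL_NR_ZONES  4
#define MODEL_ZONE_BITS 2
#define MODEL_ZONE_MASK ((1UL << MODEL_ZONE_BITS) - 1)
#define MODEL_NODE_MASK 1UL

struct zone { int node; };
struct pglist_data { struct zone node_zones[MODEL_NR_ZONES]; };
struct page { unsigned long flags; };

/* Stand-in for NODE_DATA(nid) in this model. */
static struct pglist_data node_data[MODEL_NR_NODES];

/* Record in every zone which node it belongs to. */
static void init_model(void)
{
	for (int nid = 0; nid < MODEL_NR_NODES; nid++)
		for (int z = 0; z < MODEL_NR_ZONES; z++)
			node_data[nid].node_zones[z].node = nid;
}

/* Hypothetical helper: encode node and zone index into page->flags. */
static struct page make_page(int nid, int zonenum)
{
	struct page p = {
		.flags = ((unsigned long)nid << MODEL_ZONE_BITS) |
			 (unsigned long)zonenum,
	};
	return p;
}

static int page_to_nid(const struct page *page)
{
	return (int)((page->flags >> MODEL_ZONE_BITS) & MODEL_NODE_MASK);
}

static int page_zonenum(const struct page *page)
{
	return (int)(page->flags & MODEL_ZONE_MASK);
}

static int zone_to_nid(const struct zone *zone)
{
	return zone->node;
}

/* The zone is derived from the node id, so a wrong nid gives a wrong zone. */
static struct zone *page_zone(const struct page *page)
{
	return &node_data[page_to_nid(page)].node_zones[page_zonenum(page)];
}
```

In this model, page_zone(page) == zone can only hold when page_to_nid(page)
== zone_to_nid(zone), because each zone lives inside exactly one node's
array. A zeroed struct page (flags == 0) resolves to node 0, zone 0, which
is why the node check alone tends to pass for reserved memory on node 0.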