On 10/11/19 4:06 PM, David Hildenbrand wrote:
> From: Qian Cai <cai@xxxxxx>
>
> Uninitialized memmaps contain garbage and in the worst case trigger
> kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
> touched.
>
> For example, when not onlining a memory block that is spanned by a zone
> and reading /proc/pagetypeinfo with CONFIG_DEBUG_VM_PGFLAGS and
> CONFIG_PAGE_POISONING, we can trigger a kernel BUG:
>
> :/# echo 1 > /sys/devices/system/memory/memory40/online
> :/# echo 1 > /sys/devices/system/memory/memory42/online
> :/# cat /proc/pagetypeinfo > test.file
> [ 42.489856] page:fffff2c585200000 is uninitialized and poisoned
> [ 42.489861] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> [ 42.492235] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> [ 42.493501] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> [ 42.494533] There is not page extension available.
> [ 42.495358] ------------[ cut here ]------------
> [ 42.496163] kernel BUG at include/linux/mm.h:1107!
> [ 42.497069] invalid opcode: 0000 [#1] SMP NOPTI
>
> Please note that this change does not affect ZONE_DEVICE, because
> pagetypeinfo_showmixedcount_print() is called from
> mm/vmstat.c:pagetypeinfo_showmixedcount() only for populated zones, and
> ZONE_DEVICE is never populated (zone->present_pages always 0).
>
> Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after d0dc12e86b319
> Signed-off-by: Qian Cai <cai@xxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: "Peter Zijlstra (Intel)" <peterz@xxxxxxxxxxxxx>
> Cc: Miles Chen <miles.chen@xxxxxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
> Cc: Qian Cai <cai@xxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> [ move check to outer loop, add comment, rephrase description ]
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
>
> Cai asked me to follow up on:
> [PATCH] mm/page_owner: fix a crash after memory offline
>
> ---
>  mm/page_owner.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index dee931184788..7d149211f6be 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -284,7 +284,8 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>  	 * not matter as the mixed block count will still be correct
>  	 */
>  	for (; pfn < end_pfn; ) {
> -		if (!pfn_valid(pfn)) {
> +		page = pfn_to_online_page(pfn);
> +		if (!page) {
>  			pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES);
>  			continue;
>  		}
> @@ -292,13 +293,13 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>  		block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
>  		block_end_pfn = min(block_end_pfn, end_pfn);
>
> -		page = pfn_to_page(pfn);
>  		pageblock_mt = get_pageblock_migratetype(page);
>
>  		for (; pfn < block_end_pfn; pfn++) {
>  			if (!pfn_valid_within(pfn))
>  				continue;
>
> +			/* The pageblock is online, no need to recheck. */
>  			page = pfn_to_page(pfn);
>
>  			if (page_zone(page) != zone)
>
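
For readers following along, the effect of replacing the pfn_valid() check with pfn_to_online_page() in the outer loop can be sketched in userspace. This is only a toy model under stated assumptions, not kernel code: the names model_pfn_valid(), model_pfn_online(), walk_model(), the four-entry section table and the block size of 8 pages are all hypothetical stand-ins. A "valid" section has a memmap, and an "online" section has a memmap that was actually initialized.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_BLOCK_PAGES 8UL   /* stand-in for MAX_ORDER_NR_PAGES */
#define MODEL_NR_BLOCKS   4UL

/* Round x up to the next multiple of a. */
#define MODEL_ALIGN(x, a) ((((x) + (a) - 1) / (a)) * (a))

struct model_block {
	bool valid;   /* a memmap exists (what pfn_valid() reports) */
	bool online;  /* the memmap was initialized when the block was onlined */
};

static const struct model_block model_blocks[MODEL_NR_BLOCKS] = {
	{ true,  true  },  /* onlined block */
	{ true,  false },  /* present but never onlined: memmap is garbage */
	{ true,  true  },  /* onlined block */
	{ false, false },  /* hole: no memmap at all */
};

bool model_pfn_valid(unsigned long pfn)
{
	return model_blocks[pfn / MODEL_BLOCK_PAGES].valid;
}

bool model_pfn_online(unsigned long pfn)
{
	return model_blocks[pfn / MODEL_BLOCK_PAGES].online;
}

/*
 * Walk all pfns, skipping to the next block boundary whenever check()
 * rejects a pfn, as the outer loop in the patch does. Returns the number
 * of pfns visited; *uninit counts visited pfns whose memmap was never
 * initialized (the ones that would trip the poison check).
 */
unsigned long walk_model(bool (*check)(unsigned long), unsigned long *uninit)
{
	unsigned long end_pfn = MODEL_NR_BLOCKS * MODEL_BLOCK_PAGES;
	unsigned long pfn = 0, visited = 0;

	while (pfn < end_pfn) {
		if (!check(pfn)) {
			pfn = MODEL_ALIGN(pfn + 1, MODEL_BLOCK_PAGES);
			continue;
		}
		if (!model_pfn_online(pfn))
			(*uninit)++;
		visited++;
		pfn++;
	}
	return visited;
}
```

Calling walk_model(model_pfn_valid, &n) in this model visits the never-onlined block as well, while walk_model(model_pfn_online, &n) skips it, which mirrors why the patch moves the online check into the outer loop.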