On Wed 20-09-17 16:17:03, Pavel Tatashin wrote:
> Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
> flags and other fields in "struct page"es are never changed prior to first
> initializing struct pages by going through __init_single_page().
>
> With deferred struct page feature enabled, however, we set fields in
> register_page_bootmem_info that are subsequently clobbered right after in
> free_all_bootmem:
>
>         mem_init() {
>                 register_page_bootmem_info();
>                 free_all_bootmem();
>                 ...
>         }
>
> When register_page_bootmem_info() is called only non-deferred struct pages
> are initialized. But, this function goes through some reserved pages which
> might be part of the deferred, and thus are not yet initialized.
>
>   mem_init
>    register_page_bootmem_info
>     register_page_bootmem_info_node
>      get_page_bootmem
>       .. setting fields here ..
>       such as: page->freelist = (void *)type;
>
>   free_all_bootmem()
>    free_low_memory_core_early()
>     for_each_reserved_mem_region()
>      reserve_bootmem_region()
>       init_reserved_page() <- Only if this is deferred reserved page
>        __init_single_pfn()
>         __init_single_page()
>          memset(0) <-- Lose the set fields here
>
> We end up with issue where, currently we do not observe problem as memory
> is explicitly zeroed. But, if flag asserts are changed we can start hitting
> issues.
>
> Also, because in this patch series we will stop zeroing struct page memory
> during allocation, we must make sure that struct pages are properly
> initialized prior to using them.
>
> The deferred-reserved pages are initialized in free_all_bootmem().
> Therefore, the fix is to switch the above calls.

Thanks for extending the changelog. This is more informative now.

> Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> Reviewed-by: Steven Sistare <steven.sistare@xxxxxxxxxx>
> Reviewed-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
> Reviewed-by: Bob Picco <bob.picco@xxxxxxxxxx>

I hope I haven't missed anything but it looks good to me.
Acked-by: Michal Hocko <mhocko@xxxxxxxx>

one nit below

> ---
>  arch/x86/mm/init_64.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 5ea1c3c2636e..30fe22558720 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1182,12 +1182,17 @@ void __init mem_init(void)
>
>  	/* clear_bss() already clear the empty_zero_page */
>
> -	register_page_bootmem_info();
> -
>  	/* this will put all memory onto the freelists */
>  	free_all_bootmem();
>  	after_bootmem = 1;
>
> +	/* Must be done after boot memory is put on freelist, because here we

standard code style is to do
	/*
	 * text starts here

> +	 * might set fields in deferred struct pages that have not yet been
> +	 * initialized, and free_all_bootmem() initializes all the reserved
> +	 * deferred pages for us.
> +	 */
> +	register_page_bootmem_info();
> +
>  	/* Register memory areas for /proc/kcore */
>  	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
>  			 PAGE_SIZE, KCORE_OTHER);
> --
> 2.14.1

--
Michal Hocko
SUSE Labs