On 25 Aug 2022 11:36:31 +0200, vbabka@xxxxxxx wrote: >On 8/25/22 08:31, lizhe.67@xxxxxxxxxxxxx wrote: >> From: Li Zhe <lizhe.67@xxxxxxxxxxxxx> >> >> In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")', >> we call page_ext_init() after page_alloc_init_late() to avoid some panic >> problem. It seems that we cannot track early page allocations in current >> kernel even if page structure has been initialized early. >> >> This patch introduce a new boot parameter 'early_page_ext' to resolve this >> problem. If we pass it to kernel, function page_ext_init() will be moved >> up and feature 'deferred initialization of struct pages' will be disabled. >> It can help us to catch early page allocations. This is useful especially >> when we find that the free memory value is not the same right after >> different kernel booting. >> >> Changelogs: >> >> v1->v2: >> - use a cmd line parameter to move up function page_ext_init() instead of >> using CONFIG_DEFERRED_STRUCT_PAGE_INIT >> - fix oom problem[1] >> >> v2->v3: >> - make adjustments suggested by Michal Hocko >> >> v1 patch: https://lore.kernel.org/lkml/Yv3r6Y1vh+6AbY4+@xxxxxxxxxxxxxx/T/ > >IIRC v1 failed to boot in some automatic bot test. Will this not fail with >the same config/hw combination when the parameter is passed? There is no problem with or without the parameter now. > >> v2 patch: https://lore.kernel.org/lkml/20220824065058.81051-1-lizhe.67@xxxxxxxxxxxxx/T/ >> >> [1]: https://lore.kernel.org/linux-mm/YwHmXLu5txij+p35@xsang-OptiPlex-9020/ >> >> Suggested-by: Michal Hocko <mhocko@xxxxxxxx> >> Signed-off-by: Li Zhe <lizhe.67@xxxxxxxxxxxxx> >> --- >> Documentation/admin-guide/kernel-parameters.txt | 6 ++++++ >> include/linux/page_ext.h | 11 +++++++++++ >> init/main.c | 6 +++++- >> mm/page_alloc.c | 2 ++ >> mm/page_ext.c | 12 ++++++++++++ >> 5 files changed, 36 insertions(+), 1 deletion(-) >> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt >> index d7f30902fda0..7b5726828ac0 100644 >> --- a/Documentation/admin-guide/kernel-parameters.txt >> +++ b/Documentation/admin-guide/kernel-parameters.txt >> @@ -1471,6 +1471,12 @@ >> Permit 'security.evm' to be updated regardless of >> current integrity status. >> >> + early_page_ext [KNL] Boot-time early page_ext initializing option. >> + This boot parameter disables the deferred initialization >> + of struct page and move up function page_ext_init() in >> + order to catch early page allocations. Available with >> + CONFIG_PAGE_EXTENSION=y. >> + >> failslab= >> fail_usercopy= >> fail_page_alloc= >> diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h >> index fabb2e1e087f..68d690649234 100644 >> --- a/include/linux/page_ext.h >> +++ b/include/linux/page_ext.h >> @@ -36,9 +36,15 @@ struct page_ext { >> unsigned long flags; >> }; >> >> +extern bool early_page_ext; >> extern unsigned long page_ext_size; >> extern void pgdat_page_ext_init(struct pglist_data *pgdat); >> >> +static inline bool early_page_ext_enable(void) >> +{ >> + return early_page_ext; >> +} > >I think it should better be named early_page_ext_enabled() as it returns the >status, not sets it to true? Yes you are right. I will fix it in my v4 patch. Thanks! > >> + >> #ifdef CONFIG_SPARSEMEM >> static inline void page_ext_init_flatmem(void) >> { >> @@ -67,6 +73,11 @@ static inline struct page_ext *page_ext_next(struct page_ext *curr) >> #else /* !CONFIG_PAGE_EXTENSION */ >> struct page_ext; >> >> +static inline bool early_page_ext_enable(void) >> +{ >> + return false; >> +} >> + >> static inline void pgdat_page_ext_init(struct pglist_data *pgdat) >> { >> } >> diff --git a/init/main.c b/init/main.c >> index 91642a4e69be..d95edb67a499 100644 >> --- a/init/main.c >> +++ b/init/main.c >> @@ -849,6 +849,9 @@ static void __init mm_init(void) >> pgtable_init(); >> debug_objects_mem_init(); >> vmalloc_init(); >> + /* Should be run after vmap initialization */ >> + if (early_page_ext_enable()) >> + page_ext_init(); >> /* Should be run before the first non-init thread is created */ >> init_espfix_bsp(); >> /* Should be run after espfix64 is set up. */ >> @@ -1606,7 +1609,8 @@ static noinline void __init kernel_init_freeable(void) >> padata_init(); >> page_alloc_init_late(); >> /* Initialize page ext after all struct pages are initialized. */ >> - page_ext_init(); >> + if (!early_page_ext_enable()) >> + page_ext_init(); >> >> do_basic_setup(); >> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index e5486d47406e..e580b197aa1e 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -482,6 +482,8 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn) >> { >> static unsigned long prev_end_pfn, nr_initialised; >> >> + if (early_page_ext_enable()) >> + return false; >> /* >> * prev_end_pfn static that contains the end of previous zone >> * No need to protect because called very early in boot before smp_init. >> diff --git a/mm/page_ext.c b/mm/page_ext.c >> index 3dc715d7ac29..bf4f2a12d7dc 100644 >> --- a/mm/page_ext.c >> +++ b/mm/page_ext.c >> @@ -85,6 +85,18 @@ unsigned long page_ext_size = sizeof(struct page_ext); >> >> static unsigned long total_usage; >> >> +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT >> +bool early_page_ext __meminitdata; >> +#else >> +bool early_page_ext __meminitdata = true; >> +#endif >> +static int __init setup_early_page_ext(char *str) >> +{ >> + early_page_ext = true; >> + return 0; >> +} >> +early_param("early_page_ext", setup_early_page_ext); >> + >> static bool __init invoke_need_callbacks(void) >> { >> int i;