On 08/29/23 11:33, Muchun Song wrote:
>
>
> > On Aug 29, 2023, at 05:04, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
> >
> > On 08/28/23 19:33, Muchun Song wrote:
> >>
> >>
> >>> On Aug 25, 2023, at 19:18, Usama Arif <usama.arif@xxxxxxxxxxxxx> wrote:
> >>>
> >>> The new boot flow when it comes to initialization of gigantic pages
> >>> is as follows:
> >>> - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
> >>>   the region after the first struct page is marked as noinit.
> >>> - This results in only the first struct page being
> >>>   initialized in reserve_bootmem_region. As the tail struct pages are
> >>>   not initialized at this point, there can be a significant saving
> >>>   in boot time if HVO succeeds later on.
> >>> - Later on in the boot, HVO is attempted. If it is successful, only the first
> >>>   HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> >>>   after the head struct page are initialized. If it is not successful,
> >>>   then all of the tail struct pages are initialized.
> >>>
> >>> Signed-off-by: Usama Arif <usama.arif@xxxxxxxxxxxxx>
> >>
> >> This edition is simpler than ever before, thanks for your work.
> >>
> >> There is a premise that other subsystems do not access vmemmap pages
> >> before the initialization of vmemmap pages associated with HugeTLB
> >> pages allocated from bootmem for your optimization. However, IIUC, the
> >> compaction path could access arbitrary struct pages when memory fails
> >> to be allocated via the buddy allocator. So we should make sure that
> >> those struct pages are not referenced in this routine. And I know
> >> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, it will encounter
> >> the same issue, but I don't find any code to prevent this from
> >> happening. I need more time to confirm this, if someone already knows,
> >> please let me know, thanks. So I think HugeTLB should adopt a similar
> >> way to prevent this.
> >
> > In this patch, the call to hugetlb_vmemmap_optimize() is moved BEFORE
> > __prep_new_hugetlb_folio or prep_new_hugetlb_folio in all code paths.
> > The prep_new_hugetlb_folio routine(s) are what set the destructor (soon
> > to be a flag) that identifies the set of pages as a hugetlb page. So,
> > there is now a window where a set of pages not identified as hugetlb
> > will not have vmemmap pages.
>
> Thanks for pointing it out.
>
> Seems this issue is not related to this change? hugetlb_vmemmap_optimize()
> has been called before the setting of the destructor since the initial commit
> f41f2ed43ca5. Right?
>

Thanks Muchun!

Yes, this issue exists today. It was the further separation of the calls in
this patch which pointed out the issue to me. I overlooked the fact that the
issue already exists. :(

> >
> > Recently, I closed the same window in the hugetlb freeing code paths with
> > commit 32c877191e02 'hugetlb: do not clear hugetlb dtor until allocating'.
>
> Yes, I saw it.
>
> > This patch needs to be reworked so that this window is not opened in the
> > allocation paths.
>
> So I think the fix should be a separate series.
>

Right. I can fix that up separately.
-- 
Mike Kravetz
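
For reference, the tail-page arithmetic from the patch description quoted at
the top of this thread can be sketched in user-space C. The constants below
(4 KiB base pages, a 64-byte struct page, a reserve size of one base page)
and all ex_-prefixed names are assumptions chosen for illustration. They are
not taken from the patch, and the kernel's actual values depend on
architecture and configuration.

```c
/*
 * Illustrative values only (assumptions, not taken from the patch):
 * 4 KiB base pages, a 64-byte struct page, and a vmemmap reserve size
 * of one base page.
 */
#define EX_PAGE_SIZE            4096UL
#define EX_STRUCT_PAGE_SIZE     64UL
#define EX_VMEMMAP_RESERVE_SIZE EX_PAGE_SIZE

/* Number of struct pages that describe one huge page of the given size. */
static unsigned long ex_struct_pages(unsigned long huge_page_size)
{
	return huge_page_size / EX_PAGE_SIZE;
}

/*
 * Tail struct pages that get initialized after the head struct page:
 *  - HVO succeeded: RESERVE_SIZE / sizeof(struct page) - 1
 *  - HVO failed:    every tail struct page of the huge page
 */
static unsigned long ex_tail_pages_to_init(unsigned long huge_page_size,
					   int hvo_ok)
{
	if (hvo_ok)
		return EX_VMEMMAP_RESERVE_SIZE / EX_STRUCT_PAGE_SIZE - 1;
	return ex_struct_pages(huge_page_size) - 1;
}
```

Under these assumed values, a 1 GiB gigantic page is described by 262144
struct pages. ex_tail_pages_to_init(1UL << 30, 1) gives 63 and
ex_tail_pages_to_init(1UL << 30, 0) gives 262143, which is the gap the
boot-time saving comes from when HVO succeeds.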