On Thu, Aug 31, 2023 at 02:21:06PM +0800, Muchun Song wrote:
>
>
> > On Aug 30, 2023, at 18:27, Usama Arif <usama.arif@xxxxxxxxxxxxx> wrote:
> > On 28/08/2023 12:33, Muchun Song wrote:
> >>> On Aug 25, 2023, at 19:18, Usama Arif <usama.arif@xxxxxxxxxxxxx> wrote:
> >>>
> >>> The new boot flow when it comes to initialization of gigantic pages
> >>> is as follows:
> >>> - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
> >>> the region after the first struct page is marked as noinit.
> >>> - This results in only the first struct page being
> >>> initialized in reserve_bootmem_region. As the tail struct pages are
> >>> not initialized at this point, there can be a significant saving
> >>> in boot time if HVO succeeds later on.
> >>> - Later on in the boot, HVO is attempted. If it's successful, only the first
> >>> HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> >>> after the head struct page are initialized. If it is not successful,
> >>> then all of the tail struct pages are initialized.
> >>>
> >>> Signed-off-by: Usama Arif <usama.arif@xxxxxxxxxxxxx>
> >> This edition is simpler than ever before, thanks for your work.
> >> There is a premise that other subsystems do not access vmemmap pages
> >> before the initialization of vmemmap pages associated with the HugeTLB
> >> pages allocated from bootmem for your optimization. However, IIUC, the
> >> compacting path could access arbitrary struct pages when memory fails
> >> to be allocated via the buddy allocator. So we should make sure that
> >> those struct pages are not referenced in this routine. And I know
> >> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, it will encounter
> >> the same issue, but I can't find any code to prevent this from
> >> happening. I need more time to confirm this; if someone already knows,
> >> please let me know, thanks. So I think HugeTLB should adopt a similar
> >> way to prevent this.
> >> Thanks.
> >
> > Thanks for the reviews.
> >
> > So if I understand it correctly, the uninitialized pages due to the optimization in this patch and due to DEFERRED_STRUCT_PAGE_INIT should be treated in the same way during compaction. I see that in isolate_freepages during compaction there is a check to see if the PageBuddy flag is set, and there are also calls like __pageblock_pfn_to_page to check if the pageblock is valid.
> >
> > But if a struct page is uninitialized, then it would contain random data, and these checks could pass if certain bits happened to be set?
> >
> > Compaction is done on the free list. I think the uninitialized struct pages, at least those from DEFERRED_STRUCT_PAGE_INIT, would be part of the free list, so I think their pfns would be considered for compaction.
> >
> > Could someone more familiar with DEFERRED_STRUCT_PAGE_INIT and compaction confirm how the uninitialized struct pages are handled when compaction happens? Thanks!
>
> Hi Mel,
>
> Could you help us answer this question? I think you must be the expert on
> CONFIG_DEFERRED_STRUCT_PAGE_INIT. I summarize the context here. As we all know,
> some struct pages are uninitialized when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
> enabled. If someone requests a larger allocation (e.g. order 4) via the buddy
> allocator and the allocation fails, then we go into the compacting
> routine, which will traverse all pfns and use pfn_to_page to access their struct
> pages. However, those struct pages may be uninitialized (so they hold arbitrary data).
> Our question is how to prevent the compacting routine from accessing those
> uninitialized struct pages. We'd appreciate it if you know the answer.
>

I didn't check the code but IIRC, the struct pages should be at least valid
and not contain arbitrary data once page_alloc_init_late finishes.

-- 
Mel Gorman
SUSE Labs
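
To make the tail-page arithmetic in the quoted patch description concrete, here is a minimal sketch in C. The numeric values are assumptions for illustration and are not stated in the thread: 4 KiB base pages, a 64-byte struct page, and HUGETLB_VMEMMAP_RESERVE_SIZE equal to one base page. The EX_* macros and both helper functions are hypothetical names, not kernel symbols.

```c
/* Assumed values, not taken from the thread: 4 KiB base pages, a 64-byte
 * struct page, and a vmemmap reserve size of one base page. */
#define EX_PAGE_SIZE            4096UL
#define EX_STRUCT_PAGE_SIZE     64UL
#define EX_VMEMMAP_RESERVE_SIZE EX_PAGE_SIZE

/* Tail struct pages initialized when HVO succeeds, following the formula
 * HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 from the patch
 * description. */
static unsigned long tails_init_hvo_ok(void)
{
    return EX_VMEMMAP_RESERVE_SIZE / EX_STRUCT_PAGE_SIZE - 1;
}

/* Tail struct pages initialized when HVO fails: every struct page that
 * describes the gigantic page, except the head. */
static unsigned long tails_init_hvo_failed(unsigned long hugepage_bytes)
{
    return hugepage_bytes / EX_PAGE_SIZE - 1;
}
```

Under these assumptions, a 1 GiB gigantic page would need 63 tail struct pages initialized if HVO succeeds, against 262143 if it fails, which is where the boot-time saving described in the patch would come from.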