On 09/06/23 22:27, Usama Arif wrote:
>
>
> On 06/09/2023 19:10, Mike Kravetz wrote:
> > On 09/06/23 12:26, Usama Arif wrote:
> > > The new boot flow when it comes to initialization of gigantic pages
> > > is as follows:
> > > - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
> > > the region after the first struct page is marked as noinit.
> > > - This results in only the first struct page to be
> > > initialized in reserve_bootmem_region. As the tail struct pages are
> > > not initialized at this point, there can be a significant saving
> > > in boot time if HVO succeeds later on.
> > > - Later on in the boot, the head page is prepped and the first
> > > HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> > > are initialized.
> > > - HVO is attempted. If it is not successful, then the rest of the
> > > tail struct pages are initialized. If it is successful, no more
> > > tail struct pages need to be initialized saving significant boot time.
> >
> > Code looks reasonable.  Quick question.
> >
> > On systems where HVO is disabled, we will still go through this new boot
> > flow and init hugetlb tail pages later in boot (gather_bootmem_prealloc).
> > Correct?
> > If yes, will there be a noticeable change in performance from the current
> > flow with HVO disabled?  My concern would be allocating a large number of
> > gigantic pages at boot (TB or more).
> >
>
> Thanks for the review.
>
> The patch moves the initialization of struct pages backing hugepage from
> reserve_bootmem_region to a bit later on in the boot to
> gather_bootmem_prealloc. When HVO is disabled, there will be no difference
> in time taken to boot with or without this patch series, as 262144 struct
> pages per gigantic page (for x86) are still going to be initialized, just in
> a different place.
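
For reference, the counts quoted above can be checked with a rough sketch.
It assumes x86-64 with 4 KiB base pages, 1 GiB gigantic pages, a 64-byte
struct page, and HUGETLB_VMEMMAP_RESERVE_SIZE equal to one base page. Only
the 262144 figure is stated in the thread; the EX_* constants and helper
names are made up for illustration and are not kernel symbols.

```c
#include <assert.h>

/* Assumed x86-64 values; none of these are taken from the patch itself. */
#define EX_BASE_PAGE_SIZE     4096UL             /* 4 KiB base page */
#define EX_GIGANTIC_PAGE_SIZE (1UL << 30)        /* 1 GiB gigantic page */
#define EX_STRUCT_PAGE_SIZE   64UL               /* assumed sizeof(struct page) */
#define EX_VMEMMAP_RESERVE    EX_BASE_PAGE_SIZE  /* assumed HUGETLB_VMEMMAP_RESERVE_SIZE */

/* Number of struct pages (head included) backing one huge page. */
static unsigned long struct_pages_per_hugepage(unsigned long huge_size,
                                               unsigned long base_size)
{
	assert(base_size != 0);
	return huge_size / base_size;
}

/*
 * Tail struct pages initialized before HVO is attempted:
 * HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1
 */
static unsigned long early_tail_pages(unsigned long reserve,
                                      unsigned long struct_page_size)
{
	assert(struct_page_size != 0 && reserve >= struct_page_size);
	return reserve / struct_page_size - 1;
}

/*
 * Tail struct pages still to be initialized when HVO fails or is disabled:
 * everything except the head page and the early tails.
 */
static unsigned long fallback_tail_pages(unsigned long total,
                                         unsigned long early_tails)
{
	assert(total >= early_tails + 1);
	return total - 1 - early_tails;
}
```

Under these assumptions, one gigantic page is backed by 262144 struct
pages, 63 tail pages are initialized before HVO is attempted, and 262080
remain for the fallback path. 1 TiB of gigantic pages would be 1024 such
pages, or 268435456 struct pages in total.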

I seem to recall that 'normal' deferred struct page initialization was done
in parallel as the result of these series:
https://lore.kernel.org/linux-mm/20171013173214.27300-1-pasha.tatashin@xxxxxxxxxx/
https://lore.kernel.org/linux-mm/20200527173608.2885243-1-daniel.m.jordan@xxxxxxxxxx/#t
and perhaps others.  My thought is that we lose that parallel initialization
when it is being done as part of hugetlb fall back initialization.

Does that make sense?  Or am I missing something?  I do not have any proof
that things will be slower.  That is just something I was thinking about.
-- 
Mike Kravetz