On Wed, Mar 11, 2020 at 08:38:48PM +0800, Shile Zhang wrote: Sorry, I'm late to this. I don't have a better solution, but I did try to find a way to stop holding the resize lock during (most of) page init, which would make this fix unnecessary and the deferred_init_memmap context less strange. Here are some ideas that didn't work out in case someone sees a different way forward. One thought is to unify the common parts of deferred_init_memmap and deferred_grow_zone and have callers grab chunks of pages to initialize and note the next available page to initialize for the next caller. Interrupt handlers participate in page init while it's happening rather than having to wait until it's finished. But what if a partially completed chunk is interrupted midway through and the interrupt handler needs to allocate those in-progress pages? May be possible to guarantee some memory is available if some minimum number of chunks have been completed already, but it's hard to say what that number is if the amount of memory handlers might allocate is unbounded. Given that large allocations from interrupt handlers is a theoretical issue, another thought is to reserve one section for deferred_grow_zone, should it be called during page init, and if not then the pgdatinit thread could initialize it with the resize lock held after the rest of page init is finished. Meanwhile regular page init need not hold the resize lock. If interrupt handlers try to allocate more than a section during this time, trigger a warning so we know the issue isn't theoretical. The downside is that it's possible this may not fix it for good. > @@ -1811,9 +1816,23 @@ static int __init deferred_init_memmap(v > * that we can avoid introducing any issues with the buddy > * allocator. > */ > - while (spfn < epfn) > + while (spfn < epfn) { > nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn); > + /* > + * Release the interrupts for every TICK_PAGE_COUNT pages > + * (128MB) to give tick timer the chance to update the > + * system jiffies. > + */ > + if ((nr_pages - prev_nr_pages) > TICK_PAGE_COUNT) { > + prev_nr_pages = nr_pages; > + pgdat->first_deferred_pfn = spfn; > + pgdat_resize_unlock(pgdat, &flags); > + goto again; > + } > + } > + Nits only: - s/Release the interrupts/Enable interrupts/ - take out 128MB, that assumes PAGE_SIZE is 4k I considered saving i, spfn, and epfn in pgdat to avoid having to rerun deferred_init_mem_pfn_range_in_zone every retry, but it'd enlarge pgdat for short-lived data and the function probably isn't expensive. Regardless, Reviewed-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>