On 9 Dec 2024, at 14:01, Vlastimil Babka wrote:

> On 12/6/24 10:59, David Hildenbrand wrote:
>> Let's special-case for the common scenarios that:
>>
>> (a) We are freeing pages <= pageblock_order
>> (b) We are freeing a page <= MAX_PAGE_ORDER and all pageblocks match
>>     (especially, no mixture of isolated and non-isolated pageblocks)
>
> Well in many of those cases we could also just adjust the pageblocks... But
> perhaps they indeed shouldn't differ in the first place, unless there's an
> isolation attempt.
>
>> When we encounter a > MAX_PAGE_ORDER page, it can only come from
>> alloc_contig_range(), and we can process MAX_PAGE_ORDER chunks.
>>
>> When we encounter a >pageblock_order <= MAX_PAGE_ORDER page,
>> check whether all pageblocks match, and if so (common case), don't
>> split them up just for the buddy to merge them back.
>>
>> This makes sure that when we free MAX_PAGE_ORDER chunks to the buddy,
>> for example during system startups, memory onlining, or when isolating
>> consecutive pageblocks via alloc_contig_range()/memory offlining, that
>> we don't unnecessarily split up what we'll immediately merge again,
>> because the migratetypes match.
>>
>> Rename split_large_buddy() to __free_one_page_maybe_split(), to make it
>> clearer what's happening, and handle in it only natural buddy orders,
>> not the alloc_contig_range(__GFP_COMP) special case: handle that in
>> free_one_page() only.
>>
>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
>
> Hm but noticed something:
>
>> +static void __free_one_page_maybe_split(struct zone *zone, struct page *page,
>> +		unsigned long pfn, int order, fpi_t fpi_flags)
>> +{
>> +	const unsigned long end_pfn = pfn + (1 << order);
>> +	int mt = get_pfnblock_migratetype(page, pfn);
>> +
>> +	VM_WARN_ON_ONCE(order > MAX_PAGE_ORDER);
>>  	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn, 1 << order));
>>  	/* Caller removed page from freelist, buddy info cleared! */
>>  	VM_WARN_ON_ONCE(PageBuddy(page));
>>
>> -	if (order > pageblock_order)
>> -		order = pageblock_order;
>> -
>> -	while (pfn != end) {
>> -		int mt = get_pfnblock_migratetype(page, pfn);
>> +	/*
>> +	 * With CONFIG_MEMORY_ISOLATION, we might be freeing MAX_ORDER_NR_PAGES
>> +	 * pages that cover pageblocks with different migratetypes; for example
>> +	 * only some migratetypes might be MIGRATE_ISOLATE. In that (unlikely)
>> +	 * case, fallback to freeing individual pageblocks so they get put
>> +	 * onto the right lists.
>> +	 */
>> +	if (!IS_ENABLED(CONFIG_MEMORY_ISOLATION) ||
>> +	    likely(order <= pageblock_order) ||
>> +	    pfnblock_migratetype_equal(pfn + pageblock_nr_pages, end_pfn, mt)) {
>> +		__free_one_page(page, pfn, zone, order, mt, fpi_flags);
>> +		return;
>> +	}
>>
>> -		__free_one_page(page, pfn, zone, order, mt, fpi);
>> -		pfn += 1 << order;
>> +	while (pfn != end_pfn) {
>> +		mt = get_pfnblock_migratetype(page, pfn);
>> +		__free_one_page(page, pfn, zone, pageblock_order, mt, fpi_flags);
>> +		pfn += pageblock_nr_pages;
>>  		page = pfn_to_page(pfn);
>
> This predates your patch, but seems potentially dangerous to attempt
> pfn_to_page(end_pfn) with SPARSEMEM and no vmemmap and the end_pfn perhaps
> being just outside of the valid range? Should we change that?
>
> But seems this code was initially introduced as part of Johannes'
> migratetype hygiene series.
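If we want to be defensive here, one option (a minimal, untested sketch, not
meant as the definitive fix) would be to translate the pfn at the top of each
iteration, so that pfn_to_page() is never called for end_pfn:

	while (pfn != end_pfn) {
		/* Look up the page first; pfn is always < end_pfn here. */
		page = pfn_to_page(pfn);
		mt = get_pfnblock_migratetype(page, pfn);
		__free_one_page(page, pfn, zone, pageblock_order, mt, fpi_flags);
		pfn += pageblock_nr_pages;
	}

That costs one redundant pfn_to_page() on the first iteration (the caller
already passed page in), but it only ever translates pfns inside the range
being freed. That said, I am not sure the current code is actually broken.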
It started as split_free_page() in commit b2c9e2fbba32 ("mm: make
alloc_contig_range work at pageblock granularity"), where it was harmless,
since it was only used to split a buddy page. Then commit fd919a85cd55
("mm: page_isolation: prepare for hygienic freelists") refactored it, which
should also be fine, since it was still used for the same purpose in page
isolation. Then commit e98337d11bbd ("mm/contig_alloc: support __GFP_COMP")
used it for gigantic hugetlb folios.

For SPARSEMEM && !SPARSEMEM_VMEMMAP, PFNs are contiguous but the memmap
might not be, because each section carries its own mem_map. The loop above
keys everything off pfn and recomputes page each iteration, and the page
pointer computed for end_pfn on the final pass is never dereferenced before
the loop exits, so the code might be fine. And since order is passed in,
the pfns used inside the loop should all be valid, unless the caller
provides a falsely large order. Or am I missing anything?

Best Regards,
Yan, Zi