
The patch titled
     Subject: mm/page_alloc: place pages to tail in __free_pages_core()
has been added to the -mm tree.  Its filename is
     mm-page_alloc-place-pages-to-tail-in-__free_pages_core.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-place-pages-to-tail-in-__free_pages_core.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-place-pages-to-tail-in-__free_pages_core.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: David Hildenbrand <david@xxxxxxxxxx>
Subject: mm/page_alloc: place pages to tail in __free_pages_core()

__free_pages_core() is used when exposing fresh memory to the buddy during
system boot and when onlining memory in generic_online_page().

generic_online_page() is used in two cases:

1. Direct memory onlining in online_pages().
2. Deferred memory onlining in memory-ballooning-like mechanisms (Hyper-V
   balloon and virtio-mem), when parts of a section are kept
   fake-offline to be fake-onlined later on.

In 1, we already place pages to the tail of the freelist.  Pages will be
freed to MIGRATE_ISOLATE lists first and moved to the tail of the
freelists via undo_isolate_page_range().

In 2, we currently don't implement a proper rule.  In case of virtio-mem,
where we currently always online MAX_ORDER - 1 pages, the pages will be
placed to the HEAD of the freelist - undesirable.  While the Hyper-V
balloon calls generic_online_page() with single pages, usually it will
call it on successive single pages in a larger block.

The pages are fresh, so place them to the tail of the freelist and avoid
the PCP.  In __free_pages_core(), remove the now superfluous call to
set_page_refcounted() and add a comment regarding page initialization and
the refcount.

Note: In 2. we currently don't shuffle.  If ever relevant (page shuffling
is usually of limited use in virtualized environments), we might want to
shuffle after a sequence of generic_online_page() calls in the relevant
callers.

Link: https://lkml.kernel.org/r/20201005121534.15649-5-david@xxxxxxxxxx
Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx>
Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@xxxxxxxxx>
Reviewed-by: Wei Yang <richard.weiyang@xxxxxxxxxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: Mike Rapoport <rppt@xxxxxxxxxx>
Cc: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx>
Cc: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
Cc: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
Cc: Wei Liu <wei.liu@xxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Scott Cheloha <cheloha@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/page_alloc.c |   33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-place-pages-to-tail-in-__free_pages_core
+++ a/mm/page_alloc.c
@@ -275,7 +275,8 @@ bool pm_suspended_storage(void)
 unsigned int pageblock_order __read_mostly;
 #endif
 
-static void __free_pages_ok(struct page *page, unsigned int order);
+static void __free_pages_ok(struct page *page, unsigned int order,
+			    fpi_t fpi_flags);
 
 /*
  * results with 256, 32 in the lowmem_reserve sysctl:
@@ -687,7 +688,7 @@ out:
 void free_compound_page(struct page *page)
 {
 	mem_cgroup_uncharge(page);
-	__free_pages_ok(page, compound_order(page));
+	__free_pages_ok(page, compound_order(page), FPI_NONE);
 }
 
 void prep_compound_page(struct page *page, unsigned int order)
@@ -1423,14 +1424,14 @@ static void free_pcppages_bulk(struct zo
 static void free_one_page(struct zone *zone,
 				struct page *page, unsigned long pfn,
 				unsigned int order,
-				int migratetype)
+				int migratetype, fpi_t fpi_flags)
 {
 	spin_lock(&zone->lock);
 	if (unlikely(has_isolate_pageblock(zone) ||
 		is_migrate_isolate(migratetype))) {
 		migratetype = get_pfnblock_migratetype(page, pfn);
 	}
-	__free_one_page(page, pfn, zone, order, migratetype, FPI_NONE);
+	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags);
 	spin_unlock(&zone->lock);
 }
 
@@ -1508,7 +1509,8 @@ void __meminit reserve_bootmem_region(ph
 	}
 }
 
-static void __free_pages_ok(struct page *page, unsigned int order)
+static void __free_pages_ok(struct page *page, unsigned int order,
+			    fpi_t fpi_flags)
 {
 	unsigned long flags;
 	int migratetype;
@@ -1520,7 +1522,8 @@ static void __free_pages_ok(struct page
 	migratetype = get_pfnblock_migratetype(page, pfn);
 	local_irq_save(flags);
 	__count_vm_events(PGFREE, 1 << order);
-	free_one_page(page_zone(page), page, pfn, order, migratetype);
+	free_one_page(page_zone(page), page, pfn, order, migratetype,
+		      fpi_flags);
 	local_irq_restore(flags);
 }
 
@@ -1530,6 +1533,11 @@ void __free_pages_core(struct page *page
 	struct page *p = page;
 	unsigned int loop;
 
+	/*
+	 * When initializing the memmap, __init_single_page() sets the refcount
+	 * of all pages to 1 ("allocated"/"not free"). We have to set the
+	 * refcount of all involved pages to 0.
+	 */
 	prefetchw(p);
 	for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
 		prefetchw(p + 1);
@@ -1540,8 +1548,12 @@ void __free_pages_core(struct page *page
 	set_page_count(p, 0);
 
 	atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
-	set_page_refcounted(page);
-	__free_pages(page, order);
+
+	/*
+	 * Bypass PCP and place fresh pages right to the tail, primarily
+	 * relevant for memory onlining.
+	 */
+	__free_pages_ok(page, order, FPI_TO_TAIL);
 }
 
 #ifdef CONFIG_NEED_MULTIPLE_NODES
@@ -3168,7 +3180,8 @@ static void free_unref_page_commit(struc
 	 */
 	if (migratetype >= MIGRATE_PCPTYPES) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(zone, page, pfn, 0, migratetype);
+			free_one_page(zone, page, pfn, 0, migratetype,
+				      FPI_NONE);
 			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
@@ -4991,7 +5004,7 @@ static inline void free_the_page(struct
 	if (order == 0)		/* Via pcp? */
 		free_unref_page(page);
 	else
-		__free_pages_ok(page, order);
+		__free_pages_ok(page, order, FPI_NONE);
 }
 
 void __free_pages(struct page *page, unsigned int order)
_

Patches currently in -mm which might be from david@xxxxxxxxxx are

mm-page_alloc-tweak-comments-in-has_unmovable_pages.patch
mm-page_isolation-exit-early-when-pageblock-is-isolated-in-set_migratetype_isolate.patch
mm-page_isolation-drop-warn_on_once-in-set_migratetype_isolate.patch
mm-page_isolation-cleanup-set_migratetype_isolate.patch
virtio-mem-dont-special-case-zone_movable.patch
mm-document-semantics-of-zone_movable.patch
mm-memory_hotplug-inline-__offline_pages-into-offline_pages.patch
mm-memory_hotplug-enforce-section-granularity-when-onlining-offlining.patch
mm-memory_hotplug-simplify-page-offlining.patch
mm-page_alloc-simplify-__offline_isolated_pages.patch
mm-memory_hotplug-drop-nr_isolate_pageblock-in-offline_pages.patch
mm-page_isolation-simplify-return-value-of-start_isolate_page_range.patch
mm-memory_hotplug-simplify-page-onlining.patch
mm-page_alloc-drop-stale-pageblock-comment-in-memmap_init_zone.patch
mm-pass-migratetype-into-memmap_init_zone-and-move_pfn_range_to_zone.patch
mm-memory_hotplug-mark-pageblocks-migrate_isolate-while-onlining-memory.patch
kernel-resource-make-release_mem_region_adjustable-never-fail.patch
kernel-resource-move-and-rename-ioresource_mem_driver_managed.patch
mm-memory_hotplug-guard-more-declarations-by-config_memory_hotplug.patch
mm-memory_hotplug-prepare-passing-flags-to-add_memory-and-friends.patch
mm-memory_hotplug-memhp_merge_resource-to-specify-merging-of-system-ram-resources.patch
virtio-mem-try-to-merge-system-ram-resources.patch
xen-balloon-try-to-merge-system-ram-resources.patch
hv_balloon-try-to-merge-system-ram-resources.patch
kernel-resource-make-iomem_resource-implicit-in-release_mem_region_adjustable.patch
mm-page_alloc-convert-report-flag-of-__free_one_page-to-a-proper-flag.patch
mm-page_alloc-place-pages-to-tail-in-__putback_isolated_page.patch
mm-page_alloc-move-pages-to-tail-in-move_to_free_list.patch
mm-page_alloc-place-pages-to-tail-in-__free_pages_core.patch
mm-memory_hotplug-update-comment-regarding-zone-shuffling.patch
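
Illustration only, not part of the patch: the changelog above turns on
whether a freed page lands at the head or the tail of a freelist.  The
following is a minimal userspace sketch of that difference, assuming a
single doubly linked list with a sentinel node.  All toy_* names are
hypothetical, the fpi_t typedef and the numeric flag values are assumptions
made for this model (the kernel defines them earlier in the series), and
the real buddy allocator keeps per-order, per-migratetype lists under
zone->lock, none of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins for the kernel's flag type and values. */
typedef int fpi_t;
#define FPI_NONE	((fpi_t)0)
#define FPI_TO_TAIL	((fpi_t)(1 << 1))

/* Hypothetical page descriptor: just an id plus list linkage. */
struct toy_page {
	int id;
	struct toy_page *prev, *next;
};

/* Hypothetical freelist: circular list around a sentinel node. */
struct toy_freelist {
	struct toy_page head;
};

static void toy_freelist_init(struct toy_freelist *fl)
{
	fl->head.prev = &fl->head;
	fl->head.next = &fl->head;
}

/*
 * Free a page onto the list.  With FPI_TO_TAIL the page is linked in
 * after the current last entry, otherwise right after the sentinel,
 * i.e. at the head.
 */
static void toy_free_page(struct toy_freelist *fl, struct toy_page *page,
			  fpi_t fpi_flags)
{
	struct toy_page *before;

	assert(page != NULL);
	before = (fpi_flags & FPI_TO_TAIL) ? fl->head.prev : &fl->head;

	page->prev = before;
	page->next = before->next;
	before->next->prev = page;
	before->next = page;
}

/* Allocate from the head of the list; returns NULL when it is empty. */
static struct toy_page *toy_alloc_page(struct toy_freelist *fl)
{
	struct toy_page *page = fl->head.next;

	if (page == &fl->head)
		return NULL;

	page->prev->next = page->next;
	page->next->prev = page->prev;
	page->prev = NULL;
	page->next = NULL;
	return page;
}
```

In this model, freeing pages 1, 2 and 3 with FPI_NONE, FPI_TO_TAIL and
FPI_NONE respectively leaves the list ordered 3, 1, 2, so page 2 is the
last one toy_alloc_page() hands out.  That is the property the patch wants
for freshly onlined memory: it is used only after pages already on the
list.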