On Fri 19-06-20 14:59:21, David Hildenbrand wrote:
> It's not completely obvious why we have to shuffle the complete zone, as
> some sort of shuffling is already performed when onlining pages via
> __free_one_page(), placing MAX_ORDER-1 pages either to the head or the tail
> of the freelist. Let's document why we have to shuffle the complete zone
> when exposing larger, contiguous physical memory areas to the buddy.
>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

OK, this is an improvement. I would still prefer to have this claim
backed by some numbers but it seems we are not going to get any so we
can at least pretend to try as hard as possible especially when this is
not a hot path.

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

> ---
>  mm/memory_hotplug.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 9b34e03e730a4..a0d81d404823d 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -822,6 +822,14 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  	zone->zone_pgdat->node_present_pages += onlined_pages;
>  	pgdat_resize_unlock(zone->zone_pgdat, &flags);
>
> +	/*
> +	 * When exposing larger, physically contiguous memory areas to the
> +	 * buddy, shuffling in the buddy (when freeing onlined pages, putting
> +	 * them either to the head or the tail of the freelist) is only helpful
> +	 * for maintaining the shuffle, but not for creating the initial shuffle.
> +	 * Shuffle the whole zone to make sure the just onlined pages are
> +	 * properly distributed across the whole freelist.
> +	 */
>  	shuffle_zone(zone);
>
>  	node_states_set_node(nid, &arg);
> --
> 2.26.2

--
Michal Hocko
SUSE Labs