On Mon, Apr 12, 2021 at 04:12:11PM +0200, David Hildenbrand wrote:
> > After v1 of the patch, the race was reduced to the point between the
> > zone watermark check and the rmqueue_pcplist but yes, it still existed.
> > Closing it completely was either complex or expensive. Setting
> > zone->pageset = &boot_pageset before the free would shrink the race
> > further but that still leaves a potential memory ordering issue.
> >
> > While fixable, it's either complex, expensive or both so yes, just leaving
> > the pageset structures in place would be much more straight-forward
> > assuming the structures were not allocated in the zone that is being
> > hot-removed. As things stand, I had trouble even testing zone hot-remove
> > as there was always a few pages left behind and I did not chase down
> > why.
>
> Can you elaborate? I can reliably trigger zone present pages going to 0 by
> just hotplugging a DIMM, onlining the memory block devices to the MOVABLE
> zone, followed by offlining the memory block again.
>

For the machine I was testing on, I tried offlining all memory within a zone
on a NUMA machine. Even if I used movable_zone to create a zone or numa=fake
to create multiple fake nodes and zones, there were always either reserved or
pinned pages preventing the full zone from being removed.

-- 
Mel Gorman
SUSE Labs