On 07.04.22 14:32, Mel Gorman wrote:
> On Thu, Apr 07, 2022 at 01:17:19PM +0200, Juergen Gross wrote:
>> On 07.04.22 13:07, Michal Hocko wrote:
>>> On Thu 07-04-22 12:45:41, Juergen Gross wrote:
>>>> On 07.04.22 12:34, Michal Hocko wrote:
>>>>> Ccing Mel
>>>>>
>>>>> On Thu 07-04-22 11:32:21, Juergen Gross wrote:
>>>>>> Since commit 9d3be21bf9c0 ("mm, page_alloc: simplify zonelist
>>>>>> initialization") only zones with free memory are included in a
>>>>>> built zonelist. This is problematic when e.g. all memory of a
>>>>>> zone has been ballooned out.
>>>>>
>>>>> What is the actual problem there?
>>>>
>>>> When running as Xen guest new hotplugged memory will not be onlined
>>>> automatically, but only on special request. This is done in order
>>>> to support adding e.g. the possibility to use another GB of memory,
>>>> while adding only a part of that memory initially. In case adding
>>>> that memory is populating a new zone, the page allocator won't be
>>>> able to use this memory when it is onlined, as the zone wasn't
>>>> added to the zonelist, due to managed_zone() returning 0.
>>>
>>> How is that memory onlined? Because "regular" onlining
>>> (online_pages()) does rebuild zonelists if their zone hasn't been
>>> populated before.
>>
>> The Xen balloon driver has an own callback for onlining pages. The
>> pages are just added to the ballooned-out page list without handing
>> them to the allocator. This is done only when the guest is ballooned
>> up.
>
> Is this new behaviour? I ask because keeping !managed_zones out of the

For some time (since kernel 5.9) Xen has been using the zone device
functionality with memremap_pages() and pgmap->type = MEMORY_DEVICE_GENERIC.

> zonelist and reclaim paths and the behaviour makes sense. Elsewhere you
> state "zone can always happen to have no free memory left" and this is
> true but it's usually a transient event. The difference between a populated

And if this "transient event" happens just when the zonelists are being
rebuilt, the zone will be off the lists, maybe forever.

> vs managed zone is usually permanent event where no memory will ever be
> placed on the buddy lists because the memory was reserved early in boot
> or a similar reason. The patch is probably harmless but it has the
> potential to waste CPUs allocating or reclaiming from zones that will
> never succeed.

I'd recommend having an explicit per-zone flag for this case if you really
care about that. This would be much cleaner than inferring, from no free
page being present at one specific point in time, that the zone will never
be subject to memory allocation.

Juergen
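
To make the two positions above concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: struct toy_zone, the never_managed flag, and both predicates are hypothetical names invented for illustration, and the "populated and not flagged" rule is only one assumed reading of the per-zone flag suggestion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's struct zone or zonelist. */
struct toy_zone {
    const char *name;
    unsigned long present_pages;  /* pages physically present in the zone */
    unsigned long managed_pages;  /* pages currently handed to the allocator */
    bool never_managed;           /* hypothetical flag: zone will never get pages */
};

#define TOY_MAX_ZONES 4

struct toy_zonelist {
    const struct toy_zone *zones[TOY_MAX_ZONES];
    size_t nr;
};

/* Snapshot heuristic: skip any zone with no managed pages right now. */
static bool include_by_snapshot(const struct toy_zone *z)
{
    return z->managed_pages != 0;
}

/* Explicit-flag variant: skip only zones marked as never usable. */
static bool include_by_flag(const struct toy_zone *z)
{
    return z->present_pages != 0 && !z->never_managed;
}

/* Rebuild the list with the given predicate; returns the number of entries. */
static size_t build_zonelist(struct toy_zonelist *zl,
                             const struct toy_zone *zones, size_t n,
                             bool (*include)(const struct toy_zone *))
{
    assert(n <= TOY_MAX_ZONES);
    zl->nr = 0;
    for (size_t i = 0; i < n; i++) {
        if (include(&zones[i]))
            zl->zones[zl->nr++] = &zones[i];
    }
    return zl->nr;
}
```

Given three zones (one fully managed, one whose pages are all ballooned out at rebuild time, one flagged as never managed), the snapshot predicate keeps only the first, while the flag predicate keeps the first two. The ballooned-out zone therefore stays reachable once pages are handed back, and the permanently reserved zone is still skipped.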