On Wed, 26 Jun 2024, Karim Manaouil wrote:
Maybe you mean turning ZONE_NORMAL into an array with each entry pointing to a smaller ZONE_NORMAL region of, let's say, 64GiB or something. Or it could be divided by the number of CPUs within the NUMA node, with each CPU given one ZONE_NORMAL segment and a fallback list to the other CPUs' segments in case it runs out of memory. Does that make sense?
More zones mean longer zonelists for the page allocator to walk during memory allocation. On the other hand, VM statistics are also kept per zone, so splitting the zone would decrease cacheline contention on the counters. That would be good for scaling VM things in general.
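
To make the tradeoff concrete, here is a rough userspace sketch of the per-CPU segment idea with a fallback walk. It is only a toy model under assumptions of mine: `NR_SEGMENTS`, `struct zone_segment`, `struct node_model` and `alloc_page_from()` are hypothetical names, not kernel structures or APIs, and the round-robin fallback order is just one possible policy.

```c
#include <stddef.h>

/* Hypothetical: one segment per CPU in the NUMA node. */
#define NR_SEGMENTS 4

/* Toy stand-in for a ZONE_NORMAL segment; NOT the kernel's struct zone. */
struct zone_segment {
	unsigned long free_pages;	/* pages still available in this segment */
	unsigned long nr_alloc;		/* per-segment counter, like per-zone VM stats */
};

/* Toy stand-in for a NUMA node holding the segment array. */
struct node_model {
	struct zone_segment seg[NR_SEGMENTS];
};

/*
 * Try the calling CPU's own segment first, then fall back to the other
 * segments in round-robin order. The loop is the "longer zonelist" cost:
 * in the worst case every segment is visited.
 *
 * Returns the index of the segment that satisfied the request,
 * or -1 if all segments are exhausted.
 */
static int alloc_page_from(struct node_model *node, int cpu)
{
	for (int i = 0; i < NR_SEGMENTS; i++) {
		int idx = (cpu + i) % NR_SEGMENTS;
		struct zone_segment *seg = &node->seg[idx];

		if (seg->free_pages > 0) {
			seg->free_pages--;
			/* Only this segment's counter is touched. */
			seg->nr_alloc++;
			return idx;
		}
	}
	return -1;
}
```

As long as each CPU mostly allocates from its own segment, the counters it updates stay on its own cachelines. The walk only gets long when the local segment is empty, which is where the extra zonelist length would show up.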