On 5/21/21 3:28 AM, Mel Gorman wrote:
> The PCP high watermark is based on the number of online CPUs so the
> watermarks must be adjusted during CPU hotplug. At the time of
> hot-remove, the number of online CPUs is already adjusted but during
> hot-add, a delta needs to be applied to update PCP to the correct
> value. After this patch is applied, the high watermarks are adjusted
> correctly.
>
>   # grep high: /proc/zoneinfo | tail -1
>               high:  649
>   # echo 0 > /sys/devices/system/cpu/cpu4/online
>   # grep high: /proc/zoneinfo | tail -1
>               high:  664
>   # echo 1 > /sys/devices/system/cpu/cpu4/online
>   # grep high: /proc/zoneinfo | tail -1
>               high:  649

This is actually a comment more about the previous patch, but it doesn't
really become apparent until the example above. In your example, you
mentioned increased exit() performance by using
"vm.percpu_pagelist_fraction to increase the pcp->high value".

That's presumably because of the increased batching effects and fewer
lock acquisitions. But, logically, doesn't that mean that, the more CPUs
you have in a node, the *higher* you want pcp->high to be? If we took
this to the extreme and had an absurd number of CPUs in a node, we could
end up with a too-small pcp->high value.

Also, do you worry at all about a zone with a low min_free_kbytes seeing
increased zone lock contention?

...
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bf5cdc466e6c..2761b03b3a44 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6628,7 +6628,7 @@ static int zone_batchsize(struct zone *zone)
>  #endif
>  }
>
> -static int zone_highsize(struct zone *zone)
> +static int zone_highsize(struct zone *zone, int cpu_online)
>  {
>  #ifdef CONFIG_MMU
>  	int high;
> @@ -6640,7 +6640,7 @@ static int zone_highsize(struct zone *zone)
>  	 * CPUs local to a zone. Note that early in boot that CPUs may
>  	 * not be online yet.
>  	 */
> -	nr_local_cpus = max(1U, cpumask_weight(cpumask_of_node(zone_to_nid(zone))));
> +	nr_local_cpus = max(1U, cpumask_weight(cpumask_of_node(zone_to_nid(zone)))) + cpu_online;
>  	high = low_wmark_pages(zone) / nr_local_cpus;

Is this "+ cpu_online" bias because the CPU isn't in cpumask_of_node()
when the CPU hotplug callback occurs? If so, it might be nice to mention.
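
To make the scaling concern and the delta question concrete, here is a
minimal user-space sketch of the two quoted lines. It is not kernel code:
model_zone_highsize() and its parameters are hypothetical stand-ins for
low_wmark_pages(zone), the cpumask_weight() result, and the new cpu_online
argument, and it models only the division shown in the hunk, not anything
the real function does afterwards.

```c
#include <assert.h>

/*
 * User-space model of the arithmetic in the quoted hunk (assumption:
 * only the two lines shown are modelled).
 *
 * low_wmark:  stand-in for low_wmark_pages(zone), in pages
 * node_cpus:  stand-in for cpumask_weight(cpumask_of_node(nid))
 * cpu_online: the delta added by the patch (0, or 1 during hot-add)
 */
static int model_zone_highsize(unsigned long low_wmark,
			       unsigned int node_cpus, int cpu_online)
{
	unsigned int nr_local_cpus;

	assert(cpu_online >= 0);

	/* Mirrors max(1U, weight) + cpu_online from the patch. */
	nr_local_cpus = (node_cpus > 1U ? node_cpus : 1U) + cpu_online;

	/* The per-CPU share shrinks as the node's CPU count grows. */
	return (int)(low_wmark / nr_local_cpus);
}
```

With a made-up low watermark of 10000 pages, the model gives 2500 for a
4-CPU node and 9 for a 1024-CPU node, which is the "too-small pcp->high"
direction. If the incoming CPU really is missing from the mask during the
callback, then (10000, 3, 1) gives the same 2500 as (10000, 4, 0), which
is what the "+ cpu_online" bias would be compensating for.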