Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> writes:

> On Wed, Sep 20, 2023 at 02:18:49PM +0800, Huang Ying wrote:
>> In commit f26b3fa04611 ("mm/page_alloc: limit number of high-order
>> pages on PCP during bulk free"), the PCP (Per-CPU Pageset) will be
>> drained when PCP is mostly used for high-order pages freeing to
>> improve the cache-hot pages reusing between page allocating and
>> freeing CPUs.
>>
>> On system with small per-CPU data cache, pages shouldn't be cached
>> before draining to guarantee cache-hot.  But on a system with large
>> per-CPU data cache, more pages can be cached before draining to reduce
>> zone lock contention.
>>
>> So, in this patch, instead of draining without any caching, "batch"
>> pages will be cached in PCP before draining if the per-CPU data cache
>> size is more than "4 * batch".
>>
>> On a 2-socket Intel server with 128 logical CPU, with the patch, the
>> network bandwidth of the UNIX (AF_UNIX) test case of lmbench test
>> suite with 16-pair processes increase 72.2%.  The cycles% of the
>> spinlock contention (mostly for zone lock) decreases from 45.8% to
>> 21.2%.  The number of PCP draining for high order pages
>> freeing (free_high) decreases 89.8%.  The cache miss rate keeps 0.3%.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>
> Acked-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>
> However, the flag should also have been documented to make it clear that
> it preserves some pages on the PCP if the cache is large enough.

Sure.  Will do this.

> Similar
> to the previous patch, it would have been easier to reason about in the
> general case if the decision had only been based on the LLC without
> having to worry if any intermediate layer has a meaningful impact that
> varies across CPU implementations.

Sure.  Will do this.

--
Best Regards,
Huang, Ying
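
As a reading aid, the rule in the quoted commit message (keep "batch"
pages on the PCP before draining when the per-CPU data cache is larger
than "4 * batch") might be sketched roughly as below.  This is only an
illustration, not the patch itself: the function names, the 4 KiB page
size, and the comparison in units of pages are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: true if the per-CPU share of the data cache,
 * expressed in pages, is strictly larger than 4 * batch.
 */
static bool sketch_cache_is_large(size_t cache_bytes_per_cpu,
				  unsigned int batch)
{
	return cache_bytes_per_cpu / SKETCH_PAGE_SIZE > 4UL * batch;
}

/*
 * Hypothetical helper: number of pages left on the PCP when a
 * high-order free triggers a drain.  With a small cache nothing is
 * kept (drain fully, to keep pages cache-hot); with a large cache
 * "batch" pages are kept to reduce zone lock contention.
 */
static unsigned int sketch_pages_kept_on_drain(size_t cache_bytes_per_cpu,
					       unsigned int batch)
{
	return sketch_cache_is_large(cache_bytes_per_cpu, batch) ? batch : 0;
}
```

With batch = 63, for example, the threshold is 252 pages, so a 1 MiB
per-CPU cache (256 pages) would keep 63 pages on the PCP, while a
512 KiB one (128 pages) would keep none.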