Here are the results of my research. One doc is an overview of the data
and the other is a PDF of the raw data.

https://drive.google.com/file/d/1DE8QMri1Rsr7L27fORHFCmwgrMtdfPfu/view?usp=share_link
https://drive.google.com/file/d/1UwnTeqsKB0jgpnZodJ0_cM2bOHx5aR_v/view?usp=share_link

On Thu, Apr 27, 2023 at 4:29 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 4/5/23 21:54, Binder Makin wrote:
> > I'm still running tests to explore some of these questions.
> > The machines I am using are roughly as follows.
> >
> > Intel dual socket 56 total cores
> > 192-384GB ram
> > LEVEL1_ICACHE_SIZE 32768
> > LEVEL1_DCACHE_SIZE 32768
> > LEVEL2_CACHE_SIZE 1048576
> > LEVEL3_CACHE_SIZE 40370176
> >
> > Amd dual socket 128 total cores
> > 1TB ram
> > LEVEL1_ICACHE_SIZE 32768
> > LEVEL1_DCACHE_SIZE 32768
> > LEVEL2_CACHE_SIZE 524288
> > LEVEL3_CACHE_SIZE 268435456
> >
> > Arm single socket 64 total cores
> > 256GB ram
> > LEVEL1_ICACHE_SIZE 65536
> > LEVEL1_DCACHE_SIZE 65536
> > LEVEL2_CACHE_SIZE 1048576
> > LEVEL3_CACHE_SIZE 33554432
>
> So with "some artifact of different cache layout" I didn't mean the
> different cache sizes of the processors, but possible differences in how
> objects end up placed in memory by SLAB vs SLUB, causing them to collide in
> the cache or cause false sharing less or more. This kind of interference can
> make interpreting (micro)benchmark results hard.
>
> Anyway, how I'd hope to approach this topic would be that SLAB removal is
> proposed, and anyone who opposes that because they can't switch from SLAB to
> SLUB would describe why they can't. I'd hope the "why" to be based on
> testing with actual workloads, not just benchmarks. Benchmarks are then of
> course useful if they can indeed distill the reason why the actual workload
> regresses, as then anyone can reproduce that locally and develop/test fixes
> etc. My hope is that if some kind of regression is found (e.g. due to lack
> of percpu array in SLUB), it can be dealt with by improving SLUB.
>
> Historically I recall that we (SUSE) objected somewhat to SLAB removal as our
> distro kernels were using it, but we have switched since. Then networking
> had concerns (possibly related to the lack of percpu array) but it seems bulk
> allocations helped and they use SLUB these days [1]. And IIRC Google was
> also sticking to SLAB, which led to some attempts to augment SLUB for those
> workloads years ago, but those were never finished. So I'd be curious if we
> should restart those efforts or can just remove SLAB now.
>
> [1] https://lore.kernel.org/all/93665604-5420-be5d-2104-17850288b955@xxxxxxxxxx/
>
>