On Sat, Oct 09, 2021 at 01:33:43AM +0100, Matthew Wilcox wrote:
> On Sat, Oct 09, 2021 at 12:19:03AM +0000, Hyeonggon Yoo wrote:
> > - Is there a reason that SLUB does not implement cache coloring?
> >   it will help utilizing hardware cache. Especially in block layer,
> >   they are literally *squeezing* its performance now.
>
> Have you tried turning off cache colouring in SLAB and seeing if
> performance changes?  My impression is that it's useful for caches
> with low associativity (direct mapped / 2-way / 4-way), but loses
> its effectiveness for caches with higher associativity.  For example,
> my laptop:
>
> L1 Data Cache: 48KB, 12-way associative, 64 byte line size
> L1 Instruction Cache: 32KB, 8-way associative, 64 byte line size
> L2 Unified Cache: 1280KB, 20-way associative, 64 byte line size
> L3 Unified Cache: 12288KB, 12-way associative, 64 byte line size
>
> I very much doubt that cache colouring is still useful for this machine.

On my machine:

L1 Data Cache: 32KB, 8-way associative, 64 byte line size
L1 Instruction Cache: 32KB, 8-way associative, 64 byte line size
L2 Unified Cache: 1MB, 16-way associative, 64 byte line size
L3 Unified Cache: 33MB, 11-way associative, 64 byte line size

I ran hackbench with per-node coloring, with per-cpu coloring, and
without coloring.

hackbench -g 100 -l 200000

without coloring:        2196.787
with per-node coloring:  2193.607
with per-cpu coloring:   2198.076

There seems to be almost no difference: the spread between the three
runs is about 0.2%.

How much difference did you see on low-associativity processors?

Hmm... I'll look for a related paper.
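
As a rough illustration of the associativity argument, here is a small
sketch that computes the number of sets for the caches listed above and
counts how many distinct sets the first object of each slab lands in, with
and without a colour offset. This is not SLAB's actual code. The function
names (cache_sets, set_index, distinct_sets) are hypothetical, and the
model assumes simple modulo set indexing, 4KB pages, page-aligned
single-page slabs, and a colour offset of one cache line. Real L3 caches
often hash addresses, so the model is only meaningful for L1/L2.

```python
def cache_sets(size_bytes, ways, line_size=64):
    """Number of sets = capacity / (associativity * line size)."""
    return size_bytes // (ways * line_size)


def set_index(addr, sets, line_size=64):
    """Set that an address maps to, assuming plain modulo indexing."""
    return (addr // line_size) % sets


def distinct_sets(num_slabs, sets, colour_off=0, num_colours=1,
                  page_size=4096, line_size=64):
    """Count distinct sets hit by the first object of each slab.

    Slabs are assumed to be page-aligned and one page apart; slab i
    starts its objects at offset (i % num_colours) * colour_off.
    """
    hit = set()
    for i in range(num_slabs):
        base = i * page_size
        offset = (i % num_colours) * colour_off
        hit.add(set_index(base + offset, sets, line_size))
    return len(hit)


if __name__ == "__main__":
    l1d_sets = cache_sets(32 * 1024, 8)      # 32KB, 8-way L1D
    l2_sets = cache_sets(1024 * 1024, 16)    # 1MB, 16-way L2
    print("L1D sets:", l1d_sets)
    print("L2 sets:", l2_sets)
    print("L1D, 16 slabs, no colouring:", distinct_sets(16, l1d_sets))
    print("L1D, 16 slabs, 8 colours:",
          distinct_sets(16, l1d_sets, colour_off=64, num_colours=8))
```

Under these assumptions the 32KB 8-way L1D has 64 sets, so one way spans
exactly one 4KB page and every uncoloured slab start maps to the same set.
With 8 ways, eight such lines can still sit in that set before anything is
evicted, which is one plausible reason the benefit shrinks as
associativity grows.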