On Sun, 31 Jul 2011, Pekka Enberg wrote:

> > And although slub is definitely heading in the right direction regarding
> > the netperf benchmark, it's still a non-starter for anybody using large
> > NUMA machines for networking performance.  On my 16-core, 4 node, 64GB
> > client/server machines running netperf TCP_RR with various thread counts
> > for 60 seconds each on 3.0:
> >
> >     threads     SLUB        SLAB        diff
> >      16          76345       74973      - 1.8%
> >      32         116380      116272      - 0.1%
> >      48         150509      153703      + 2.1%
> >      64         187984      189750      + 0.9%
> >      80         216853      224471      + 3.5%
> >      96         236640      249184      + 5.3%
> >     112         256540      275464      + 7.4%
> >     128         273027      296014      + 8.4%
> >     144         281441      314791      +11.8%
> >     160         287225      326941      +13.8%
>
> That looks like a pretty nasty scaling issue. David, would it be
> possible to see 'perf report' for the 160 case? [ Maybe even 'perf
> annotate' for the interesting SLUB functions. ]
>

More interesting than the perf report (which just shows kfree,
kmem_cache_free, and kmem_cache_alloc dominating) are the statistics
exported by slub itself: they show the "slab thrashing" issue that I have
described several times over the past few years.  It's difficult to
address because it's a result of slub's design.  From the client side of
160 netperf TCP_RR threads for 60 seconds:

    cache           alloc_fastpath          alloc_slowpath
    kmalloc-256     10937512 (62.8%)        6490753
    kmalloc-1024    17121172 (98.3%)        303547
    kmalloc-4096    5526281                 11910454 (68.3%)

    cache           free_fastpath           free_slowpath
    kmalloc-256     15469                   17412798 (99.9%)
    kmalloc-1024    11604742 (66.6%)        5819973
    kmalloc-4096    14848                   17421902 (99.9%)

With those stats, there's no way that slub will even be able to compete
with slab because it's not optimized for the slowpath.  There are ways to
mitigate that, like my slab thrashing patchset from a couple of years ago
that you tracked for a while, which improved performance 3-4% at the cost
of an increment in the fastpath, but everything else requires more memory.
You could preallocate the slabs on the partial list, increase the per-node
min_partial, increase the order of the slabs themselves so you hit the
free fastpath much more often, etc, but they all come at a considerable
cost in memory.

I'm very confident that slub could beat slab on any system if you throw
enough memory at it because its fastpaths are extremely efficient, but
there's no business case for that.
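
For anyone who wants to check the percentages in the statistics quoted in
this message, here is a minimal sketch that recomputes the fastpath share of
allocations and frees per cache.  The counters are hard-coded from the
message; nothing is read from a running kernel, and the names STATS,
fastpath_share, and summarize are made up for this illustration.

```python
# Counters copied from the statistics quoted in this message
# (client side, 160 netperf TCP_RR threads, 60 seconds).
STATS = {
    "kmalloc-256": {
        "alloc_fastpath": 10937512, "alloc_slowpath": 6490753,
        "free_fastpath": 15469, "free_slowpath": 17412798,
    },
    "kmalloc-1024": {
        "alloc_fastpath": 17121172, "alloc_slowpath": 303547,
        "free_fastpath": 11604742, "free_slowpath": 5819973,
    },
    "kmalloc-4096": {
        "alloc_fastpath": 5526281, "alloc_slowpath": 11910454,
        "free_fastpath": 14848, "free_slowpath": 17421902,
    },
}


def fastpath_share(fast, slow):
    """Return the percentage of operations that took the fastpath."""
    total = fast + slow
    return 100.0 * fast / total if total else 0.0


def summarize(stats):
    """Map each cache name to (alloc fastpath %, free fastpath %)."""
    return {
        cache: (
            fastpath_share(c["alloc_fastpath"], c["alloc_slowpath"]),
            fastpath_share(c["free_fastpath"], c["free_slowpath"]),
        )
        for cache, c in stats.items()
    }


if __name__ == "__main__":
    for cache, (alloc_pct, free_pct) in summarize(STATS).items():
        print(f"{cache:<14} alloc fastpath {alloc_pct:5.1f}%   "
              f"free fastpath {free_pct:5.1f}%")
```

The free side is the telling part: for kmalloc-256 and kmalloc-4096 only
about 0.1% of frees take the fastpath, which is the complement of the 99.9%
slowpath figures in the statistics.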