On Mon, 1 Aug 2011, Pekka Enberg wrote:

> Looking at the data (in slightly reorganized form):
>
> alloc
> =====
>
> 16 threads:
>
> cache         alloc_fastpath     alloc_slowpath
> kmalloc-256   4263275  (91.1%)   417445    (8.9%)
> kmalloc-1024  4636360  (99.1%)   42091     (0.9%)
> kmalloc-4096  2570312  (54.4%)   2155946  (45.6%)
>
> 160 threads:
>
> cache         alloc_fastpath     alloc_slowpath
> kmalloc-256   10937512 (62.8%)   6490753  (37.2%)
> kmalloc-1024  17121172 (98.3%)   303547    (1.7%)
> kmalloc-4096  5526281  (31.7%)   11910454 (68.3%)
>
> free
> ====
>
> 16 threads:
>
> cache         free_fastpath      free_slowpath
> kmalloc-256   210115    (4.5%)   4470604  (95.5%)
> kmalloc-1024  3579699  (76.5%)   1098764  (23.5%)
> kmalloc-4096  67616     (1.4%)   4658678  (98.6%)
>
> 160 threads:
>
> cache         free_fastpath      free_slowpath
> kmalloc-256   15469     (0.1%)   17412798 (99.9%)
> kmalloc-1024  11604742 (66.6%)   5819973  (33.4%)
> kmalloc-4096  14848     (0.1%)   17421902 (99.9%)
>
> it's pretty sad to see how SLUB alloc fastpath utilization drops so
> dramatically. Free fastpath utilization isn't all that great with 160
> threads either but it seems to me that most of the performance
> regression compared to SLAB still comes from the alloc paths.

It's the opposite: the cumulative effect of the free slowpath is more
costly in terms of latency than the alloc slowpath because it occurs at
a greater frequency.  The pattern that I described as "slab thrashing"
before causes a single free to a full slab, the list manipulation needed
to get that slab back on the partial list, and then the alloc slowpath
grabbing it for only a single allocation, so the next alloc requires yet
another partial slab.
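
To make that sequence concrete, below is a minimal userspace sketch of
the thrashing pattern.  It is not SLUB code; the names (toy_cache,
toy_slab, toy_alloc, toy_free) are made up, and it assumes a single cpu
slab, a single partial list, and slabs that start out full so the effect
is easy to trigger.  Every free to a full slab that is not the cpu slab
takes the slowpath and pushes that slab onto the partial list, and every
subsequent alloc takes the slowpath again because the slab it pulls off
the partial list holds only the one object that was just freed:

#include <stdio.h>

struct toy_slab {
        int free_objects;               /* objects still available in this slab */
        struct toy_slab *next;          /* link on the partial list */
};

struct toy_cache {
        struct toy_slab *cpu_slab;      /* fastpath allocates from here */
        struct toy_slab *partial;       /* slowpath refills from here */
        unsigned long alloc_slowpath;
        unsigned long free_slowpath;
};

/*
 * Allocate one object.  In this toy an "object" is just a count in its
 * slab, so the function returns the slab it allocated from.
 */
static struct toy_slab *toy_alloc(struct toy_cache *c)
{
        if (!c->cpu_slab || !c->cpu_slab->free_objects) {
                /* slowpath: refill the cpu slab from the partial list */
                c->alloc_slowpath++;
                c->cpu_slab = c->partial;
                if (!c->cpu_slab)
                        return NULL;
                c->partial = c->cpu_slab->next;
        }
        c->cpu_slab->free_objects--;
        return c->cpu_slab;
}

/* Free one object back to the slab it came from. */
static void toy_free(struct toy_cache *c, struct toy_slab *slab)
{
        int was_full = !slab->free_objects;

        slab->free_objects++;
        if (slab == c->cpu_slab)
                return;                 /* fastpath: free to the cpu slab */

        /* slowpath: a previously full slab goes back on the partial list */
        c->free_slowpath++;
        if (was_full) {
                slab->next = c->partial;
                c->partial = slab;
        }
}

int main(void)
{
        struct toy_cache cache = { 0 };
        struct toy_slab slabs[4] = { 0 };       /* four slabs, all full */
        int i;

        /* One free per full slab: each one takes the free slowpath. */
        for (i = 0; i < 4; i++)
                toy_free(&cache, &slabs[i]);

        /*
         * Each alloc takes the slowpath too: the slab pulled off the
         * partial list has only the single object that was just freed,
         * so the next alloc needs yet another partial slab.
         */
        for (i = 0; i < 4; i++)
                toy_alloc(&cache);

        printf("alloc_slowpath=%lu free_slowpath=%lu\n",
               cache.alloc_slowpath, cache.free_slowpath);
        return 0;
}

Under those toy assumptions this prints alloc_slowpath=4 free_slowpath=4,
i.e. every operation on both sides misses the fastpath, which is the
shape the 160-thread kmalloc-256 and kmalloc-4096 numbers above take.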