On Thu, 15 May 2008, Zhang, Yanmin wrote:

> > It can thrash cachelines if objects from the same slab page are freed
> > simultaneously on multiple processors. That occurred in the hackbench
> > regression that we addressed with the dynamic configuration of slab sizes.
> hackbench regression is because of slow allocation instead of slow freeing.
> With dynamic configuration of slab sizes, fast allocation becomes 97% (the bad
> one is 68%), but fast free is always 8~9% with/without the patch.

Thanks for using the slab statistics. I wish I had these numbers for the TPC
benchmark. That would allow us to understand what is going on while it is
running.

The frees in hackbench were slow because partial list updates occurred too
frequently. The first fix was to let slabs sit longer on the partial list. The
other was the increase of the slab sizes, which also increases the per cpu slab
size and therefore the number of objects that can be allocated without a round
trip to the page allocator. Freeing to a per cpu slab never requires partial
list updates, so the frees also benefitted from the larger slab sizes. But the
effect shows up in the count of partial list updates, not in the fast free
column.
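
To illustrate the distinction, here is a rough toy model. It is not the kernel code: every toy_* name, the 4 KiB page size and the 256 byte object size are assumptions made up for this sketch. It shows how a higher slab order raises the number of objects per slab, and how a free can take the slow path with or without touching the partial list. The model counts a partial list update only when a slab goes from full to partial or from partial to empty.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Objects that fit into one slab of 2^order pages (metadata ignored). */
static unsigned int toy_objects_per_slab(unsigned int order,
					 unsigned int object_size)
{
	return (TOY_PAGE_SIZE << order) / object_size;
}

/* The three outcomes this toy model distinguishes for a free. */
enum toy_free_path {
	TOY_FREE_FAST,		/* object is in this cpu's current slab */
	TOY_FREE_SLOW,		/* other slab, list membership unchanged */
	TOY_FREE_SLOW_PARTIAL,	/* other slab, partial list is updated */
};

/* Hypothetical stand-in for a slab page; not the kernel's struct page. */
struct toy_slab {
	unsigned int objects;	/* capacity of the slab */
	unsigned int inuse;	/* objects currently allocated */
};

/* Free one object from 'slab' and report which path the free took. */
static enum toy_free_path toy_free(struct toy_slab *slab,
				   const struct toy_slab *cpu_slab)
{
	int was_full;

	if (slab == cpu_slab) {
		/* Per cpu slab: never touches the partial list. */
		slab->inuse--;
		return TOY_FREE_FAST;
	}

	was_full = (slab->inuse == slab->objects);
	slab->inuse--;

	/* Full -> partial, or partial -> empty, changes the partial list. */
	if (was_full || slab->inuse == 0)
		return TOY_FREE_SLOW_PARTIAL;

	return TOY_FREE_SLOW;
}
```

Under these assumptions an order 0 slab holds 16 objects of 256 bytes and an order 3 slab holds 128. A free that misses the per cpu slab is slow in both of the remaining cases, so the fast free percentage stays the same, but only the second slow case costs a partial list update.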