On Thu, 2008-05-15 at 10:05 -0700, Christoph Lameter wrote:
> On Thu, 15 May 2008, Zhang, Yanmin wrote:
> > > It can thrash cachelines if objects from the same slab page are freed
> > > simultaneously on multiple processors. That occurred in the hackbench
> > > regression that we addressed with the dynamic configuration of slab sizes.
> > The hackbench regression is because of slow allocation instead of slow freeing.
> > With the dynamic configuration of slab sizes, fast allocation becomes 97% (the bad
> > one is 68%), but fast free stays at 8~9% with or without the patch.
> Thanks for using the slab statistics. I wish I had these numbers for the
> TPC benchmark. That would allow us to understand what is going on while it
> is running.
>
> The frees in the hackbench were slow because partial list updates occurred
> too frequently. The first fix was to let a slab sit longer on the partial
> list.
I had forgotten that. 2.6.24 merged that patch.

> The other was the increase of the slab sizes, which also increases
> the per cpu slab size and therefore the number of objects allocatable without a
> round trip to the page allocator.
That is what I was referring to. 2.6.26-rc merged that patch.

> Freeing to a per cpu slab never requires
> partial list updates. So the frees also benefited from the larger slab
> sizes. But the effect shows up in the count of partial list updates, not in
> the fast-free column.
I agree. It might be better if SLUB could be optimized again to pay more
attention to cases where the slow-free percentage is high, because the page
lock can ping-pong among processors if multiple processors access the same
slab at the same time.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html