On Thu, 16 Apr 2015 10:54:07 -0500 (CDT) Christoph Lameter <cl@xxxxxxxxx> wrote:

> On Thu, 16 Apr 2015, Jesper Dangaard Brouer wrote:
>
> > On CPU E5-2630 @ 2.30GHz, the cost of kmem_cache_alloc +
> > kmem_cache_free in a tight loop (the most optimal fast-path) is
> > 22ns, with elem size 256 bytes, where SLUB chooses to make
> > 32 obj-per-slab.
> >
> > With this patch, testing different bulk sizes, the cost of
> > alloc+free per element is improved for small bulk sizes (which I
> > guess is the expected outcome).
> >
> > To have something to compare against, I also ran the bulk sizes
> > through the fallback versions __kmem_cache_alloc_bulk() and
> > __kmem_cache_free_bulk(), i.e. the non-optimized versions.
> >
> > size    -- optimized -- fallback
> > bulk  8 --    15ns   --   22ns
> > bulk 16 --    15ns   --   22ns
>
> Good.
>
> > bulk 30 --    44ns   --   48ns
> > bulk 32 --    47ns   --   50ns
> > bulk 64 --    52ns   --   54ns
>
> Hmm... We are hitting the atomics, I guess. What you got so far is
> only using the per-cpu data. Wonder how many partial pages are
> available.

Oops, I can see that this kernel doesn't have CONFIG_SLUB_CPU_PARTIAL
enabled. I'll re-run the tests with it enabled.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
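
For anyone wanting to reproduce numbers of this kind, below is a
minimal sketch of the tight alloc+free measurement loops discussed
above, written as a kernel module. The cache name, 256-byte object
size, loop count and bulk size are illustrative assumptions, not the
exact test harness used for these measurements; the bulk calls are
assumed to take (cache, gfp, count, array) as in the patch under
discussion.

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/ktime.h>

#define LOOPS 1000000
#define BULK  16

static int __init slab_bench_init(void)
{
	struct kmem_cache *cache;
	void *objs[BULK];
	u64 start;
	int i;

	/* 256-byte objects, matching the elem size used above */
	cache = kmem_cache_create("bench_256", 256, 0, 0, NULL);
	if (!cache)
		return -ENOMEM;

	/* Fast-path baseline: single alloc+free in a tight loop */
	start = ktime_get_ns();
	for (i = 0; i < LOOPS; i++) {
		void *obj = kmem_cache_alloc(cache, GFP_KERNEL);

		if (!obj)
			break;
		kmem_cache_free(cache, obj);
	}
	pr_info("single: %llu ns per alloc+free\n",
		(ktime_get_ns() - start) / LOOPS);

	/* Bulk path: cost is reported per element, as in the table */
	start = ktime_get_ns();
	for (i = 0; i < LOOPS; i++) {
		if (!kmem_cache_alloc_bulk(cache, GFP_KERNEL, BULK, objs))
			break;
		kmem_cache_free_bulk(cache, BULK, objs);
	}
	pr_info("bulk %d: %llu ns per element\n", BULK,
		(ktime_get_ns() - start) / ((u64)LOOPS * BULK));

	kmem_cache_destroy(cache);
	return 0;
}

static void __exit slab_bench_exit(void)
{
}

module_init(slab_bench_init);
module_exit(slab_bench_exit);
MODULE_LICENSE("GPL");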