On Thu, 16 Apr 2015 10:54:07 -0500 (CDT) Christoph Lameter <cl@xxxxxxxxx> wrote:

> On Thu, 16 Apr 2015, Jesper Dangaard Brouer wrote:
>
> > On CPU E5-2630 @ 2.30GHz, the cost of kmem_cache_alloc +
> > kmem_cache_free in a tight loop (the most optimal fast-path) is
> > 22ns, with elem size 256 bytes, where SLUB chooses to make
> > 32 obj-per-slab.
> >
> > With this patch, testing different bulk sizes, the cost of
> > alloc+free per element is improved for small bulk sizes (which I
> > guess is the expected outcome).
> >
> > To have something to compare against, I also ran the bulk sizes
> > through the fallback versions __kmem_cache_alloc_bulk() and
> > __kmem_cache_free_bulk(), i.e. the non-optimized versions.
> >
> > size    -- optimized -- fallback
> > bulk  8 --    15ns   --   22ns
> > bulk 16 --    15ns   --   22ns
>
> Good.
>
> > bulk 30 --    44ns   --   48ns
> > bulk 32 --    47ns   --   50ns
> > bulk 64 --    52ns   --   54ns
>
> Hmm... We are hitting the atomics, I guess. What you got so far is
> only using the per-cpu data. Wonder how many partial pages are
> available.

Oops, I can see that this kernel doesn't have CONFIG_SLUB_CPU_PARTIAL
enabled. I'll re-run the tests with it enabled.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
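
For anyone wanting to reproduce numbers of this kind, below is a
minimal sketch of the tight alloc+free measurement loops discussed
above, written as a kernel module. The cache name, 256-byte object
size, loop count and bulk size are illustrative assumptions, not the
exact test harness used for these measurements; the bulk calls are
assumed to take (cache, gfp, count, array) as in the patch under
discussion.

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/ktime.h>

#define LOOPS 1000000
#define BULK  16

static int __init slab_bench_init(void)
{
	struct kmem_cache *cache;
	void *objs[BULK];
	u64 start;
	int i;

	/* 256-byte objects, matching the elem size used above */
	cache = kmem_cache_create("bench_256", 256, 0, 0, NULL);
	if (!cache)
		return -ENOMEM;

	/* Fast-path baseline: single alloc+free in a tight loop */
	start = ktime_get_ns();
	for (i = 0; i < LOOPS; i++) {
		void *obj = kmem_cache_alloc(cache, GFP_KERNEL);

		if (!obj)
			break;
		kmem_cache_free(cache, obj);
	}
	pr_info("single: %llu ns per alloc+free\n",
		(ktime_get_ns() - start) / LOOPS);

	/* Bulk path: cost is reported per element, as in the table */
	start = ktime_get_ns();
	for (i = 0; i < LOOPS; i++) {
		if (!kmem_cache_alloc_bulk(cache, GFP_KERNEL, BULK, objs))
			break;
		kmem_cache_free_bulk(cache, BULK, objs);
	}
	pr_info("bulk %d: %llu ns per element\n", BULK,
		(ktime_get_ns() - start) / ((u64)LOOPS * BULK));

	kmem_cache_destroy(cache);
	return 0;
}

static void __exit slab_bench_exit(void)
{
}

module_init(slab_bench_init);
module_exit(slab_bench_exit);
MODULE_LICENSE("GPL");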