On Tue, 8 Sep 2015 12:32:40 -0500 (CDT)
Christoph Lameter <cl@xxxxxxxxx> wrote:

> On Sat, 5 Sep 2015, Jesper Dangaard Brouer wrote:
>
> > The double_cmpxchg without lock prefix still costs 9 cycles, which is
> > very fast but still a cost (add approx 19 cycles for a lock prefix).
> >
> > It is slower than local_irq_disable + local_irq_enable, which only cost
> > 7 cycles and which the bulking call uses.  (That is the reason bulk calls
> > with 1 object can almost compete with the fastpath.)
>
> Hmmm... Guess we need to come up with distinct versions of kmalloc() for
> irq and non-irq contexts to take advantage of that. Most are in non-irq
> context anyway.

I agree, it would be an easy win.  Do notice this will have the most
impact for the slAb allocator.

I estimate the combined alloc + free cost would drop by:
 * slAb: approx 60 cycles
 * slUb: approx 4 cycles

We might consider keeping the slUb approach, as it would be more
friendly for RT, with less IRQ disabling.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
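
The slUb estimate can be reproduced from the cycle counts quoted in this thread. The sketch below assumes (this is a reading of the mail, not something it states) that the saving comes from replacing the 9-cycle double_cmpxchg with the 7-cycle local_irq_disable + local_irq_enable pair, once on alloc and once on free. The slAb figure of approx 60 cycles depends on measurements not given here, so it is not derived. All names in the code are illustrative, not kernel identifiers.

```python
# Back-of-envelope check of the slUb estimate, using only the cycle
# counts quoted in this thread. Names are illustrative, not kernel symbols.

CMPXCHG_DOUBLE_CYCLES = 9       # double_cmpxchg without lock prefix
LOCK_PREFIX_EXTRA_CYCLES = 19   # approximate extra cost of a lock prefix
IRQ_DISABLE_ENABLE_CYCLES = 7   # local_irq_disable + local_irq_enable


def saving_per_operation(current_cycles, replacement_cycles):
    """Cycles saved by swapping one synchronisation primitive for another."""
    return current_cycles - replacement_cycles


# Assumption: alloc and free each pay for the primitive exactly once.
OPERATIONS = 2  # one alloc + one free

slub_saving = OPERATIONS * saving_per_operation(
    CMPXCHG_DOUBLE_CYCLES, IRQ_DISABLE_ENABLE_CYCLES
)

# For comparison: approximate per-operation cost of a locked double_cmpxchg.
locked_cmpxchg_cycles = CMPXCHG_DOUBLE_CYCLES + LOCK_PREFIX_EXTRA_CYCLES

print(f"slUb alloc + free saving: approx {slub_saving} cycles")
print(f"locked double_cmpxchg:    approx {locked_cmpxchg_cycles} cycles")
```

Under these assumptions the gain is only 2 cycles per operation, which fits the suggestion that keeping the slUb approach may be the better trade for RT.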