On Sat, Dec 15, 2012 at 02:16:48PM +0100, Jens Axboe wrote:
> On 2012-12-15 11:36, Kent Overstreet wrote:
> >> Knock yourself out - I already took a quick look at it, and conversion
> >> should be pretty simple. It's the mtip32xx driver, it's in the kernel. I
> >> would suggest getting rid of the ->async_callback() (since it's always
> >> bio_endio()) since that'll make it cleaner.
> >
> > Just pushed my conversion - it's untested, but it's pretty
> > straightforward.
>
> You forgot a batch_complete_init(). With that, it works. Single device
> is ~1050K now, so still slower than jaio without batching (which was
> ~1220K). But it's an improvement over kaio-dio, which was roughly ~930K
> IOPS.

Curious... if the device is delivering a reasonable number of
completions per interrupt, I would've expected that to help more (it
made a huge difference for me).

Now I'm really curious where the difference is coming from. It's
possible something I did introduced a performance regression that you're
uncovering (e.g. I reordered stuff in struct kiocb to shrink it; not
sure if you were testing with those changes).

It sounds like the mtip32xx driver is better/more efficient than
anything I can test with. If so, it's entirely possible you're seeing
the regression because there's less noise there.

Or maybe just getting rid of the ringbuffer is that awesome. Gonna try
and work on combining our optimizations so I can see what that looks
like :)
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html