On 4/26/21 10:15 AM, Christoph Hellwig wrote:
> On Mon, Apr 26, 2021 at 09:12:09AM -0600, Jens Axboe wrote:
>> Here's the series. It's not super clean (yet), but basically allows
>> users like io_uring to set up a bio cache, and pass that in through
>> iocb->ki_bi_cache. With that, we can recycle bios instead of going
>> through free+alloc continually. If you look at profiles for high
>> IOPS workloads, we're spending more time than desired doing just
>> that.
>>
>> https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-bio-cache
>
> So where do you spend the cycles? The "do not memset the whole bio"
> optimization is pretty obvious and is something we should do
> independent of the allocator.

The memset is just a small optimization on top. In current profiles,
the alloc+free side looks something like:

+    2.71%  io_uring  [kernel.vmlinux]  [k] bio_alloc_bioset
+    2.03%  io_uring  [kernel.vmlinux]  [k] kmem_cache_alloc

and

+    2.82%  io_uring  [kernel.vmlinux]  [k] __slab_free
+    1.73%  io_uring  [kernel.vmlinux]  [k] kmem_cache_free
     0.36%  io_uring  [kernel.vmlinux]  [k] mempool_free_slab
     0.27%  io_uring  [kernel.vmlinux]  [k] mempool_free

That is a substantial number of cycles spent just to repeatedly reuse
the same set of bios for IO. With the caching patchset, all of the
above is eliminated entirely, and the only thing we still allocate
dynamically is the request, which is a lot cheaper (it ends up being
1-2% on either kernel).

> The other thing that sucks is the mempool implementation, as it
> forces each allocation and free to do an indirect call. I think it
> might be worth trying to frontend it with a normal slab cache and
> only fall back to the mempool if that fails.

That is minor as well, I believe, but yes, it will eat cycles too.
FWIW, the testing above was done without RETPOLINE.

--
Jens Axboe
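
PS: To make the recycling idea concrete, it boils down to something
like the sketch below. This is not the actual patchset interface; the
names (struct bio_alloc_cache, bio_cache_get/put) and the freelist cap
are made up for illustration, and it assumes bios that use their
inline bvecs only:

#include <linux/bio.h>

/*
 * Illustrative per-context bio cache: a small freelist of bios,
 * chained through bi_next, reused instead of going through
 * mempool free + alloc on every IO.
 */
struct bio_alloc_cache {
	struct bio	*free_list;	/* singly linked via bi_next */
	unsigned int	nr;		/* entries currently cached */
};

static struct bio *bio_cache_get(struct bio_alloc_cache *cache,
				 struct bio_set *bs,
				 unsigned short nr_vecs, gfp_t gfp)
{
	struct bio *bio = cache->free_list;

	if (bio) {
		cache->free_list = bio->bi_next;
		cache->nr--;
		/* reinitialize the recycled bio for reuse */
		bio_init(bio, bio->bi_inline_vecs, nr_vecs);
		return bio;
	}
	/* cache empty, fall back to the normal bioset allocation */
	return bio_alloc_bioset(gfp, nr_vecs, bs);
}

static void bio_cache_put(struct bio_alloc_cache *cache, struct bio *bio)
{
	if (cache->nr < 64) {		/* arbitrary cap for the sketch */
		bio->bi_next = cache->free_list;
		cache->free_list = bio;
		cache->nr++;
	} else {
		bio_put(bio);
	}
}

The owner calls bio_cache_put() instead of bio_put() on completion, so
the alloc+free pairs in the profile above disappear from the fast path.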
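
Similarly, the slab-frontend idea for the mempool would look roughly
like this (again purely illustrative; it assumes the mempool was
created with mempool_create_slab_pool() on the same kmem_cache, so
elements from either path can be freed the same way):

#include <linux/mempool.h>
#include <linux/slab.h>

/*
 * Try a plain slab allocation first, avoiding mempool_alloc()'s
 * indirect call through pool->alloc on the fast path.  Mask out
 * direct reclaim so the attempt fails fast, then fall back to the
 * mempool, which can block and guarantees forward progress.
 */
static void *frontend_alloc(struct kmem_cache *cache, mempool_t *pool,
			    gfp_t gfp)
{
	void *element;

	element = kmem_cache_alloc(cache,
			(gfp & ~__GFP_DIRECT_RECLAIM) | __GFP_NOWARN);
	if (element)
		return element;
	/* slow path: dip into the mempool reserves */
	return mempool_alloc(pool, gfp);
}

static void frontend_free(void *element, mempool_t *pool)
{
	/*
	 * mempool_free() refills the reserve if it is below the
	 * minimum, otherwise hands the element back to the slab;
	 * the indirect call remains on this path in the sketch.
	 */
	mempool_free(element, pool);
}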