On 4/26/21 9:06 AM, Christoph Hellwig wrote:
> On Mon, Apr 26, 2021 at 08:57:31AM -0600, Jens Axboe wrote:
>> I was separately curious about this as I have a (as of yet unposted)
>> patchset that recycles bio allocations, as we spend quite a bit of time
>> doing that for high rate polled IO. It's good for taking the above 2.97M
>> IOPS to 3.2-3.3M IOPS, and it'd obviously be a bit more problematic with
>> required RCU freeing of bio's. Even without the alloc cache, using RCU
>> will ruin any potential cache locality on back-to-back bio free + bio
>> alloc.
>
> That sucks indeed.  How do you recycle the bios?  If we make sure the

Here's the series. It's not super clean (yet), but basically allows
users like io_uring to set up a bio cache, and pass that in through
iocb->ki_bi_cache. With that, we can recycle them instead of going
through free+alloc continually. If you look at profiles for high IOPS,
we're spending more time than desired doing just that.

https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-bio-cache

> bio is only ever recycled as a bio and bi_bdev remains valid long
> enough we might not need the rcu free.  Even without your recycling
> we could probably do something nasty using SLAB_TYPESAFE_BY_RCU.

It would not be hard to restrict to same bdev for the cache, just one
more check to do for recycling. Note that the caching series _only_
supports polled IO for now, as non-polled would require IRQ juggling for
free+alloc and that will definitely take some of the win away and maybe
even render it moot. Have yet to test that part out. Not a huge deal
with the RCU free, as you end up doing that purely for polled IO and
hence wouldn't impact the IRQ side of things negatively.

-- 
Jens Axboe
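
The recycling idea discussed in this message can be illustrated with a
rough userspace sketch. This is not code from the linked series: struct
obj stands in for struct bio, dev_id stands in for bi_bdev, and every
name here (obj_cache, obj_alloc, obj_free, OBJ_CACHE_MAX) is
hypothetical. It assumes alloc and free always run in the same task
context, as with polled IO, so no locking or IRQ juggling is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define OBJ_CACHE_MAX 64	/* hypothetical cap on cached objects */

/* Stand-in for struct bio; only the fields this sketch needs. */
struct obj {
	int dev_id;		/* stand-in for bi_bdev */
	struct obj *next;	/* free-list link, used only while cached */
};

/* Hypothetical per-user cache, e.g. one per ring. */
struct obj_cache {
	int dev_id;		/* only objects for this device are recycled */
	unsigned int nr;	/* number of objects currently cached */
	struct obj *free_list;	/* LIFO: most recently freed is reused first */
};

static void obj_cache_init(struct obj_cache *c, int dev_id)
{
	assert(c != NULL);
	c->dev_id = dev_id;
	c->nr = 0;
	c->free_list = NULL;
}

/* Take a recycled object if one matches the device, else fall back to malloc. */
static struct obj *obj_alloc(struct obj_cache *c, int dev_id)
{
	struct obj *o;

	if (c && c->dev_id == dev_id && c->free_list) {
		o = c->free_list;
		c->free_list = o->next;
		c->nr--;
	} else {
		o = malloc(sizeof(*o));
		if (!o)
			return NULL;
	}
	o->dev_id = dev_id;
	o->next = NULL;
	return o;
}

/* Recycle into the cache when the device matches and there is room. */
static void obj_free(struct obj_cache *c, struct obj *o)
{
	if (c && o->dev_id == c->dev_id && c->nr < OBJ_CACHE_MAX) {
		o->next = c->free_list;
		c->free_list = o;
		c->nr++;
		return;
	}
	free(o);
}

/* Release everything still held by the cache. */
static void obj_cache_destroy(struct obj_cache *c)
{
	while (c->free_list) {
		struct obj *o = c->free_list;

		c->free_list = o->next;
		free(o);
	}
	c->nr = 0;
}
```

The device comparison in obj_free() and obj_alloc() corresponds to the
"one more check" for restricting the cache to the same bdev, and the
LIFO order is meant to keep a back-to-back free + alloc on the same,
still cache-hot object.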