On Wed, Dec 15, 2021 at 09:30:09AM -0700, Jens Axboe wrote:
> We currently cannot use the bio recycling allocation cache for IRQ driven
> IO, as the cache isn't IRQ safe (by design).
>
> Add a way for the completion side to pass back a bio that needs freeing,
> so we can do it from the io_uring side. io_uring completions always
> run in task context.
>
> This is good for about a 13% improvement in IRQ driven IO, taking us from
> around 6.3M/core to 7.1M/core IOPS.

The numbers look great, but I really hate how it ties the caller into
using a bio. I'll have to think hard about a better structure.

Just curious: are the numbers with retpolines or without? Do you care
about the cost of indirect calls with retpolines for these benchmarks?