Hi,

This is v2 of this patchset. The main change from v1 is that we're no
longer passing the cache pointer in struct kiocb, and the primary reason
for that is to avoid growing it by 8 bytes. That would take it over one
cacheline, and that is a noticeable slowdown for hot users of kiocb.
Hence this was re-architected to store it in the per-task io_uring
structure instead. The only real downside of that imho is that we need
calls to get it, and that it's obviously then io_uring specific rather
than being able to have multiple users of this. The latter I don't
consider a big problem, as nobody else supports async polled IO anyway.

The tl;dr here is that we get about a 10% bump in polled performance
with this patchset, as we can recycle bio structures essentially for
free. Outside of that, explanations are in each patch. I've also got an
iomap patch, but I'm trying to keep this single user until there's
agreement on the direction.

Against for-5.15/io_uring, and can also be found in my
io_uring-bio-cache.2 branch.

 block/bio.c              | 126 +++++++++++++++++++++++++++++++++++---
 fs/block_dev.c           |  30 ++++++++--
 fs/io_uring.c            |  52 ++++++++++++++++
 include/linux/bio.h      |  24 ++++++--
 include/linux/fs.h       |   2 +
 include/linux/io_uring.h |   7 +++
 6 files changed, 221 insertions(+), 20 deletions(-)

--
Jens Axboe