Add bio pcpu caching for IRQ-driven I/O. This extends the REQ_ALLOC_CACHE
infrastructure, which is currently limited to iopoll. Benchmarked with
t/io_uring and an Optane SSD: 2.22 -> 2.32 MIOPS for qd32 (+4.5%) and
2.60 -> 2.82 MIOPS for qd128 (+8.4%).

It works best with per-cpu queues. Otherwise other effects may come into
play, e.g. bios allocated by one CPU but freed by another. Even so, the
worst case (every allocation goes to the mempool) doesn't show any
performance degradation.

Currently, it's only enabled for previous REQ_ALLOC_CACHE users but will
be turned on system-wide later.

v2: fix botched splicing threshold checks

v3: remove merged patch
    limit scope of flags var in bio_put_percpu_cache

v4: correct outdated comment
    fix in-irq put -> splice modifying the non-irq safe cache list
    fix alloc null dereference

Pavel Begunkov (6):
  mempool: introduce mempool_is_saturated
  bio: don't rob starving biosets of bios
  bio: split pcpu cache part of bio_put into a helper
  bio: add pcpu caching for non-polling bio_put
  bio: shrink max number of pcpu cached bios
  io_uring/rw: enable bio caches for IRQ rw

 block/bio.c             | 98 +++++++++++++++++++++++++++++++----------
 include/linux/mempool.h |  5 +++
 io_uring/rw.c           |  3 +-
 3 files changed, 82 insertions(+), 24 deletions(-)

-- 
2.38.0