On Tue, Aug 10, 2021 at 6:40 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> Initialize a bio allocation cache, and mark it as being used for
> IOPOLL. We could use it for non-polled IO as well, but it'd need some
> locking and probably would negate much of the win in that case.

For regular (non-polled) IO, would it make sense to tie a bio cache to
each fixed-buffer slot (the ctx->user_bufs array)? That is, one bio cache
(along with its lock) per slot. That may localize the lock contention.
And contention would happen only when multiple IOs are spawned from the
same fixed buffer concurrently?

> We start with IOPOLL, as completions are locked by the ctx lock anyway.
> So no further locking is needed there.
>
> This brings an IOPOLL gen2 Optane QD=128 workload from ~3.0M IOPS to
> ~3.25M IOPS.

--
Kanchan
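
To make the per-slot idea a bit more concrete, here is a rough userspace
sketch, not kernel code and not part of the patch. Every name in it
(fake_bio, slot_cache, NR_SLOTS, slot_bio_alloc, slot_bio_free) is
hypothetical, and a pthread mutex stands in for whatever lock the kernel
side would actually use. It only shows the shape: each fixed-buffer slot
owns a free list and a lock, so two IOs contend only if they use the same
slot.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached bio; NOT the kernel's struct bio. */
struct fake_bio {
	struct fake_bio *next;
};

/* Assumed layout: one free list plus its own lock per fixed-buffer slot. */
struct slot_cache {
	pthread_mutex_t lock;
	struct fake_bio *free_list;
	unsigned int nr;	/* number of entries currently cached */
};

#define NR_SLOTS 4	/* hypothetical number of registered buffers */

struct slot_cache slot_caches[NR_SLOTS];

void slot_caches_init(void)
{
	for (size_t i = 0; i < NR_SLOTS; i++) {
		pthread_mutex_init(&slot_caches[i].lock, NULL);
		slot_caches[i].free_list = NULL;
		slot_caches[i].nr = 0;
	}
}

/* Pop from the slot's cache; fall back to malloc() when it is empty. */
struct fake_bio *slot_bio_alloc(unsigned int slot)
{
	struct slot_cache *c = &slot_caches[slot];
	struct fake_bio *bio;

	pthread_mutex_lock(&c->lock);
	bio = c->free_list;
	if (bio) {
		c->free_list = bio->next;
		c->nr--;
	}
	pthread_mutex_unlock(&c->lock);

	if (!bio)
		bio = malloc(sizeof(*bio));
	if (bio)
		bio->next = NULL;
	return bio;
}

/* Push a finished bio back onto the cache of the slot it came from. */
void slot_bio_free(unsigned int slot, struct fake_bio *bio)
{
	struct slot_cache *c = &slot_caches[slot];

	pthread_mutex_lock(&c->lock);
	bio->next = c->free_list;
	c->free_list = bio;
	c->nr++;
	pthread_mutex_unlock(&c->lock);
}
```

IOs on different slots take different locks here, so the only contended
case is several in-flight IOs sharing one fixed buffer. Whether the lock
cost in that case still eats the win is the open question.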