On Thu, May 04, 2023 at 09:24:27AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@xxxxxxxxxx>
>
> io_uring tries to optimize allocating tags by hinting to the plug how
> many it expects to need for a batch instead of allocating each tag
> individually. But io_uring submission queues may have a mix of many
> devices for io, so the number of io's counted may be overestimated. This
> can lead to allocating too many tags, which adds overhead to finding
> that many contiguous tags, freeing up the ones we didn't use, and may
> starve out other users that can actually use them.

When running batched IO to multiple nvme drives, like with t/io_uring,
this patch shows a tiny improvement in CPU utilization from avoiding the
unlikely cleanup condition in __blk_flush_plug() shown below:

	if (unlikely(!rq_list_empty(plug->cached_rq)))
		blk_mq_free_plug_rqs(plug);
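
To put a rough number on the overestimate described in the quoted text,
here is a toy userspace model. It is not kernel code: leftover_tags() and
its parameters are hypothetical names, and it assumes the whole batch hint
is allocated against the first device seen in the batch, so every io aimed
at another device leaves one hinted tag unused.

```c
#include <stddef.h>

/*
 * Toy model, not kernel code. dev_of_io[i] is the (hypothetical) device
 * index targeted by the i-th io in one submission batch.
 *
 * Assumption: the batch hint equals nr_ios and all hinted tags are
 * allocated for the device of the first io. Tags hinted for ios that
 * target any other device stay unused and would have to be freed when
 * the plug is flushed.
 */
int leftover_tags(const int *dev_of_io, size_t nr_ios)
{
	size_t i, used = 0;

	if (nr_ios == 0)
		return 0;

	/* Count the ios that can actually consume a pre-allocated tag. */
	for (i = 0; i < nr_ios; i++)
		if (dev_of_io[i] == dev_of_io[0])
			used++;

	return (int)(nr_ios - used);
}
```

Under that assumption, eight ios spread round-robin over four devices
leave six of the eight hinted tags unused, while a batch aimed at a single
device leaves none. The leftover case is what the cleanup branch above
exists to handle.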