On Tue, Mar 11, 2025 at 12:58:46AM -0700, Christoph Hellwig wrote: > On Tue, Mar 11, 2025 at 09:33:01AM +0800, Ming Lei wrote: > > On Mon, Mar 10, 2025 at 12:14:44PM +0100, Christoph Hellwig wrote: > > > On Sun, Mar 09, 2025 at 12:23:08AM +0800, Ming Lei wrote: > > > > Try to handle loop aio command via NOWAIT IO first, then we can avoid to > > > > queue the aio command into workqueue. > > > > > > > > Fallback to workqueue in case of -EAGAIN. > > > > > > > > BLK_MQ_F_BLOCKING has to be set for calling into .read_iter() or > > > > .write_iter() which might sleep even though it is NOWAIT. > > > > > > This needs performance numbers (or other reasons) justifying the > > > change, especially as BLK_MQ_F_BLOCKING is a bit of an overhead. > > > > The difference is just rcu_read_lock() vs. srcu_read_lock(), and not > > Not, it also includes offloading to kblockd in more cases. But loop doesn't run into these cases: blk_execute_rq_nowait(): blk_mq_run_hw_queue(hctx, hctx->flags & BLK_MQ_F_BLOCKING); blk_mq_start_hw_queue(struct blk_mq_hw_ctx *hctx) blk_mq_run_hw_queue(hctx, hctx->flags & BLK_MQ_F_BLOCKING); blk_mq_start_stopped_hw_queues ... > > > > > > > see any difference in typical fio workload on loop device, and the gain > > is pretty obvious, bandwidth is increased by > 4X in aio workloads: > > > > https://lore.kernel.org/linux-block/f7c9d956-2b9b-8bb4-aa49-d57323fc8eb0@xxxxxxxxxx/T/#md3a6154218cb6619d8af5432cf2dd3a4a7a3dcc6 > > Please document that in the commit log. > > > > > + if (cmd->ret == -EAGAIN) { > > > > + struct loop_device *lo = rq->q->queuedata; > > > > + > > > > + loop_queue_work(lo, cmd); > > > > + return; > > > > + } > > > > > > This looks like the wrong place for the rety, as -EAGAIN can only come from > > > the submissions path. i.e. we should never make it to the full completion > > > path for that case. > > > > That is not true, at least for XFS: > > Your trace sees lo_rw_aio_complete called from the block layer > splitting called from loop, I see nothing about XFS there. But yes, > this shows the issue discussed last week in the iomap IOCB_NOWAIT > thread. Looks I mis-parse the stack, but it is still returned from blkdev's ->ki_complete(), and need to be handled. Thanks, Ming