Re: [RESEND PATCH 4/5] loop: try to handle loop aio command via NOWAIT IO first

On Mon, Mar 10, 2025 at 12:14:44PM +0100, Christoph Hellwig wrote:
> On Sun, Mar 09, 2025 at 12:23:08AM +0800, Ming Lei wrote:
> > Try to handle loop aio command via NOWAIT IO first, so we can avoid
> > queueing the aio command into the workqueue.
> > 
> > Fall back to the workqueue in case of -EAGAIN.
> > 
> > BLK_MQ_F_BLOCKING has to be set for calling into .read_iter() or
> > .write_iter() which might sleep even though it is NOWAIT.
> 
> This needs performance numbers (or other reasons) justifying the
> change, especially as BLK_MQ_F_BLOCKING is a bit of an overhead.

The difference is just rcu_read_lock() vs. srcu_read_lock(), and I don't
see any difference in a typical fio workload on a loop device. The gain
is pretty obvious: bandwidth increases by more than 4X in aio workloads:

https://lore.kernel.org/linux-block/f7c9d956-2b9b-8bb4-aa49-d57323fc8eb0@xxxxxxxxxx/T/#md3a6154218cb6619d8af5432cf2dd3a4a7a3dcc6

> 
> >  static DEFINE_IDR(loop_index_idr);
> >  static DEFINE_MUTEX(loop_ctl_mutex);
> >  static DEFINE_MUTEX(loop_validate_mutex);
> > @@ -380,8 +382,17 @@ static void lo_rw_aio_do_completion(struct loop_cmd *cmd)
> >  
> >  	if (!atomic_dec_and_test(&cmd->ref))
> >  		return;
> > +
> > +	if (cmd->ret == -EAGAIN) {
> > +		struct loop_device *lo = rq->q->queuedata;
> > +
> > +		loop_queue_work(lo, cmd);
> > +		return;
> > +	}
> 
> This looks like the wrong place for the retry, as -EAGAIN can only come from
> the submission path, i.e. we should never make it to the full completion
> path for that case.

That is not true, at least for XFS:

[root@ktest-40 io]# bpftrace -e 'kretfunc:lo_rw_aio_complete /args->ret == -11/ { @eagain[kstack] = count() } '
Attaching 1 probe...
^C

@eagain[
    bpf_prog_6deef7357e7b4530_sd_fw_ingress+28250
    bpf_prog_6deef7357e7b4530_sd_fw_ingress+28250
    bpf_trampoline_367219848433+108
    lo_rw_aio_complete+9
    blkdev_bio_end_io_async+63
    bio_submit_split+347
    blk_mq_submit_bio+395
    __submit_bio+116
    submit_bio_noacct_nocheck+773
    submit_bio_wait+87
    xfs_rw_bdev+348
    xlog_do_io+131
    xlog_write_log_records+451
    xlog_find_tail+829
    xlog_recover+61
    xfs_log_mount+259
    xfs_mountfs+1232
    xfs_fs_fill_super+1507
    get_tree_bdev_flags+303
    vfs_get_tree+38
    vfs_cmd_create+89
    __do_sys_fsconfig+1286
    do_syscall_64+130
    entry_SYSCALL_64_after_hwframe+118
]: 2
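
To illustrate why the check sits in the completion path, here is a rough
userspace sketch of the pattern. The mock_* names are hypothetical stand-ins,
not the driver code: the result arrives through the completion callback (as in
the trace above), and only the holder of the last reference decides whether to
requeue or finish.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct loop_cmd; only what the sketch needs. */
struct mock_cmd {
	atomic_int ref;   /* outstanding references, like cmd->ref */
	long ret;         /* result recorded by the completion callback */
	bool requeued;    /* set when the command is handed to the slow path */
	bool finished;    /* set when the request is ended normally */
};

/* Drop one reference; only the last holder acts on the recorded result. */
static void mock_do_completion(struct mock_cmd *cmd)
{
	/* atomic_fetch_sub() returns the value before the decrement */
	if (atomic_fetch_sub(&cmd->ref, 1) != 1)
		return;

	if (cmd->ret == -EAGAIN) {
		/* stands in for loop_queue_work(): retry without NOWAIT */
		cmd->requeued = true;
		return;
	}
	cmd->finished = true;
}

/* Completion callback: the error is delivered here, not via a return value. */
static void mock_complete(struct mock_cmd *cmd, long ret)
{
	cmd->ret = ret;
	mock_do_completion(cmd);
}
```

With two references held (submitter and completion), an -EAGAIN delivered via
mock_complete() is only acted on once the submitter drops its reference too.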


> 
> >  static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd, loff_t pos)
> > +{
> > +	unsigned int nr_bvec = lo_cmd_nr_bvec(cmd);
> > +	int ret;
> > +
> > +	cmd->iocb.ki_flags &= ~IOCB_NOWAIT;
> > +	ret = lo_submit_rw_aio(lo, cmd, pos, nr_bvec);
> > +	if (ret != -EIOCBQUEUED)
> > +		lo_rw_aio_complete(&cmd->iocb, ret);
> > +	return 0;
> 
> This needs an explanation that it is for the fallback path and thus
> clears the nowait flag.

OK.

> 
> > +}
> > +
> > +static int lo_rw_aio_nowait(struct loop_device *lo, struct loop_cmd *cmd, loff_t pos)
> 
> Overly long line.
> 
> > @@ -1926,6 +1955,17 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
> >  		break;
> >  	}
> >  
> > +	if (cmd->use_aio) {
> > +		loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
> > +		int ret = lo_rw_aio_nowait(lo, cmd, pos);
> > +
> > +		if (!ret)
> > +			return BLK_STS_OK;
> > +		if (ret != -EAGAIN)
> > +			return BLK_STS_IOERR;
> > +		/* fallback to workqueue for handling aio */
> > +	}
> 
> Why isn't all the logic in this branch in lo_rw_aio_nowait?

Good catch, I just found we have BLK_STS_AGAIN.
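
One possible shape for that, as a minimal userspace sketch only: the
mock_status enum and mock_nowait_status() are hypothetical names that mirror
BLK_STS_OK, BLK_STS_IOERR and BLK_STS_AGAIN, and this is not the follow-up
patch. The helper would return a status directly, so the caller only has to
check for the "again" case before falling back to the workqueue.

```c
#include <errno.h>

/* Hypothetical userspace mirror of the three block statuses discussed. */
enum mock_status {
	MOCK_STS_OK,     /* handled inline via NOWAIT */
	MOCK_STS_IOERR,  /* hard failure, end the request */
	MOCK_STS_AGAIN,  /* would block: caller falls back to the workqueue */
};

/* Translate the errno-style result of the NOWAIT attempt into a status. */
static enum mock_status mock_nowait_status(int ret)
{
	if (ret == 0)
		return MOCK_STS_OK;
	if (ret == -EAGAIN)
		return MOCK_STS_AGAIN;
	return MOCK_STS_IOERR;
}
```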


Thanks,
Ming




