On 5/28/20 3:15 AM, Xiaoguang Wang wrote:
> If requests can be submitted and completed inline, we don't need to
> initialize whole io_wq_work in io_init_req(), which is an expensive
> operation, add a new 'REQ_F_WORK_INITIALIZED' to control whether
> io_wq_work is initialized.
>
> I use /dev/nullb0 to evaluate performance improvement in my physical
> machine:
>   modprobe null_blk nr_devices=1 completion_nsec=0
>   sudo taskset -c 60 fio -name=fiotest -filename=/dev/nullb0 -iodepth=128
>   -thread -rw=read -ioengine=io_uring -direct=1 -bs=4k -size=100G -numjobs=1
>   -time_based -runtime=120
>
> before this patch:
> Run status group 0 (all jobs):
>    READ: bw=724MiB/s (759MB/s), 724MiB/s-724MiB/s (759MB/s-759MB/s),
>    io=84.8GiB (91.1GB), run=120001-120001msec
>
> With this patch:
> Run status group 0 (all jobs):
>    READ: bw=761MiB/s (798MB/s), 761MiB/s-761MiB/s (798MB/s-798MB/s),
>    io=89.2GiB (95.8GB), run=120001-120001msec
>
> About 5% improvement.

I think this is a big enough win to warrant looking closer at this.
Just a quick comment from me so far:

> @@ -2923,7 +2943,10 @@ static int io_fsync(struct io_kiocb *req, bool force_nonblock)
>  {
>  	/* fsync always requires a blocking context */
>  	if (force_nonblock) {
> -		req->work.func = io_fsync_finish;
> +		if (!(req->flags & REQ_F_WORK_INITIALIZED))
> +			init_io_work(req, io_fsync_finish);
> +		else
> +			req->work.func = io_fsync_finish;

This pattern is repeated enough to warrant a helper, ala:

static void io_req_init_async(req, func)
{
	if (req->flags & REQ_F_WORK_INITIALIZED)
		req->work.func = func;
	else
		init_io_work(req, func);
}

I also swapped the conditions; I tend to find it easier to read without
the negation.

-- 
Jens Axboe