hi,
On 5/28/20 3:15 AM, Xiaoguang Wang wrote:
If requests can be submitted and completed inline, we don't need to
initialize the whole io_wq_work in io_init_req(), which is an expensive
operation. Add a new 'REQ_F_WORK_INITIALIZED' flag to control whether
io_wq_work is initialized.
I used /dev/nullb0 to evaluate the performance improvement on my physical
machine:
modprobe null_blk nr_devices=1 completion_nsec=0
sudo taskset -c 60 fio -name=fiotest -filename=/dev/nullb0 -iodepth=128 \
    -thread -rw=read -ioengine=io_uring -direct=1 -bs=4k -size=100G -numjobs=1 \
    -time_based -runtime=120
Before this patch:
Run status group 0 (all jobs):
READ: bw=724MiB/s (759MB/s), 724MiB/s-724MiB/s (759MB/s-759MB/s),
io=84.8GiB (91.1GB), run=120001-120001msec
With this patch:
Run status group 0 (all jobs):
READ: bw=761MiB/s (798MB/s), 761MiB/s-761MiB/s (798MB/s-798MB/s),
io=89.2GiB (95.8GB), run=120001-120001msec
About 5% improvement.
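For reference, the "about 5%" figure can be rechecked from the two bandwidth numbers above. This is only a small sketch of the arithmetic; percent_gain() is a name made up for illustration and is not part of the patch or of fio.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the patch): relative change from
 * 'before' to 'after', expressed in percent.
 */
static double percent_gain(double before, double after)
{
	assert(before > 0.0);	/* a zero baseline has no meaningful ratio */
	return (after - before) / before * 100.0;
}
```

Calling percent_gain(724.0, 761.0) with the MiB/s values reported by fio gives a little over 5, which matches the number quoted in the mail.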
I think this is a big enough of a win to warrant looking closer
at this. Just a quick comment from me so far:
Yeah, to be honest, I didn't expect to get such an obvious improvement.
But I have run multiple rounds of the same test and always get a similar
improvement. If you have some free time, you can give it a test :)
@@ -2923,7 +2943,10 @@ static int io_fsync(struct io_kiocb *req, bool force_nonblock)
 {
 	/* fsync always requires a blocking context */
 	if (force_nonblock) {
-		req->work.func = io_fsync_finish;
+		if (!(req->flags & REQ_F_WORK_INITIALIZED))
+			init_io_work(req, io_fsync_finish);
+		else
+			req->work.func = io_fsync_finish;
This pattern is repeated enough to warrant a helper, ala:
static void io_req_init_async(req, func)
{
	if (req->flags & REQ_F_WORK_INITIALIZED)
		req->work.func = func;
	else
		init_io_work(req, func);
}
I also swapped the conditions; I tend to find it easier to read without
the negation.
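To make the suggested helper concrete, here is a minimal userspace sketch that compiles on its own. The struct layouts, the flag's bit position, the io_work_fn typedef, the body of init_io_work() and the demo_finish_*() handlers are all assumptions made for illustration; they are simplified stand-ins, not the kernel's definitions or the final form of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumed, not the real ones). */
struct io_wq_work;
typedef void (*io_work_fn)(struct io_wq_work **);

struct io_wq_work {
	io_work_fn func;
	unsigned int flags;
};

#define REQ_F_WORK_INITIALIZED	(1U << 0)	/* bit position is assumed */

struct io_kiocb {
	unsigned int flags;
	struct io_wq_work work;
};

/*
 * Assumed behaviour of init_io_work(): do the full work setup and
 * remember that it has been done.
 */
static void init_io_work(struct io_kiocb *req, io_work_fn func)
{
	req->work = (struct io_wq_work){ .func = func };
	req->flags |= REQ_F_WORK_INITIALIZED;
}

/* The helper from the review, with explicit types filled in. */
static void io_req_init_async(struct io_kiocb *req, io_work_fn func)
{
	assert(func != NULL);
	if (req->flags & REQ_F_WORK_INITIALIZED)
		req->work.func = func;		/* already set up: swap handler only */
	else
		init_io_work(req, func);	/* first async use: full setup */
}

/* Hypothetical handlers, present only so the sketch can be exercised. */
static void demo_finish_a(struct io_wq_work **workptr) { (void)workptr; }
static void demo_finish_b(struct io_wq_work **workptr) { (void)workptr; }
```

With this shape, a call site such as the io_fsync() hunk above would shrink to a single io_req_init_async(req, io_fsync_finish) call, and a request that completes inline never pays for the work initialization.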
Thanks for your suggestions. I'll prepare a V4 soon.
Regards,
Xiaoguang Wang