The buffered random write performance of io_uring is poor for the
following reason:

By default, when performing buffered random writes, io_sq_thread calls
io_issue_sqe() to issue the write req. Because IO_URING_F_NONBLOCK is
set, the req is executed asynchronously in iou-wrk, where
io_wq_submit_work() calls io_issue_sqe() to complete the write req with
issue_flags set to IO_URING_F_UNLOCKED | IO_URING_F_IOWQ. This reduces
performance.

This patch determines whether the req is a buffered random write and,
if so, has io_sq_thread call io_issue_sqe(req, 0) directly to complete
the req instead of completing it asynchronously in iou-wrk.

Performance results:
For fio, the following results were obtained with a queue depth of 8
and a 4k block size:

random writes:
         without patch   with patch   libaio      psync
iops:    287k            560k         248K        324K
bw:      1123MB/s        2188MB/s     970MB/s     1267MB/s
clat:    52760ns         69918ns      28405ns     2109ns

Signed-off-by: luhongfei <luhongfei@xxxxxxxx>
---
 io_uring/io_uring.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)
 mode change 100644 => 100755 io_uring/io_uring.c

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 4a865f0e85d0..64bb91beb4d6
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2075,8 +2075,23 @@ static inline void io_queue_sqe(struct io_kiocb *req)
 	__must_hold(&req->ctx->uring_lock)
 {
 	int ret;
+	bool is_write;
 
-	ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
+	switch (req->opcode) {
+	case IORING_OP_WRITEV:
+	case IORING_OP_WRITE_FIXED:
+	case IORING_OP_WRITE:
+		is_write = true;
+		break;
+	default:
+		is_write = false;
+		break;
+	}
+
+	if (!is_write || (req->rw.kiocb.ki_flags & IOCB_DIRECT))
+		ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
+	else
+		ret = io_issue_sqe(req, 0);
 
 	/*
 	 * We async punt it if the file wasn't marked NOWAIT, or if the file
-- 
2.39.0