On 4/19/23 3:22?AM, luhongfei wrote: > diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c > index 4a865f0e85d0..64bb91beb4d6 > --- a/io_uring/io_uring.c > +++ b/io_uring/io_uring.c > @@ -2075,8 +2075,23 @@ static inline void io_queue_sqe(struct io_kiocb *req) > __must_hold(&req->ctx->uring_lock) > { > int ret; > + bool is_write; > > - ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER); > + switch (req->opcode) { > + case IORING_OP_WRITEV: > + case IORING_OP_WRITE_FIXED: > + case IORING_OP_WRITE: > + is_write = true; > + break; > + default: > + is_write = false; > + break; > + } > + > + if (!is_write || (req->rw.kiocb.ki_flags & IOCB_DIRECT)) > + ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER); > + else > + ret = io_issue_sqe(req, 0); > > /* > * We async punt it if the file wasn't marked NOWAIT, or if the file We really can't just do that, implicitly. What you are doing is making any of write synchronous. What are you writing to in terms of device or file? If file, what file system is being used? Curious if the target supports async buffered writes, guessing it does not which is why you see io-wq activity for all of them. That said, I did toss out a test patch a while back that explicitly sets up the ring such that we'll do blocking IO rather than do a non-blocking attempt and then punt it if that fails. And I do think there's a use case for that, in case you just want to use io_uring for batched syscalls and don't care about if you end up blocking for some IO. Let's do a primer on what happens for io_uring issue: 1) Non-blocking issue is attempted for IO. If successful, we're done for now. 2) Case 1 failed. Now we have two options a) We can poll the file. We arm poll, and we're done for now until that triggers. b) File cannot be polled, we punt to io-wq which then does a blocking attempt. For case 2b, this is the one where we could've just done a blocking attempt initially if the ring was setup with a flag explicitly saying that's what the application wants. Or io_uring_enter() had a flag passed in that explicitly said this is what the applications wants. I suspect we'll want both, to cover both SQPOLL and !SQPOLL. I'd recommend we still retain non-blocking issue for pollable files, as you could very quickly block forever otherwise. Imagine an empty pipe and a read issued to it in the blocking mode. A solution like that would cater to your case too, without potentially breaking a lot of things like your patch could. The key here is the explicit nature of it, we cannot just go and make odd assumptions about a particular opcode type (writes) and ring type (SQPOLL) and say "oh this one is fine for just ignoring blocking off the issue path". -- Jens Axboe