There is no reliable way to submit and wait in a single syscall, as
io_submit_sqes() may under-consume sqes (in case of an early error).
io_uring_enter() will then wait for not-yet-submitted requests,
deadlocking the user in most cases.

In such cases adjust min_complete, so it won't wait for more than what
has been submitted in the current call to io_uring_enter(). This may be
less than the total number of in-flight requests, including previous
submissions, but that shouldn't do any harm and is up to the user.

Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
---
v2: cap min_complete if submitted partially (Jens Axboe)
v3: update commit message (Jens Axboe)

 fs/io_uring.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 81219a631a6d..5dfc805ec31c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3763,11 +3763,8 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr,
 		unsigned int sqe_flags;
 
 		req = io_get_req(ctx, statep);
-		if (unlikely(!req)) {
-			if (!submitted)
-				submitted = -EAGAIN;
+		if (unlikely(!req))
 			break;
-		}
 		if (!io_get_sqring(ctx, req)) {
 			__io_free_req(req);
 			break;
@@ -5272,6 +5269,14 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		submitted = io_submit_sqes(ctx, to_submit, f.file, fd,
 					   &cur_mm, false);
 		mutex_unlock(&ctx->uring_lock);
+
+		if (submitted != to_submit) {
+			if (!submitted) {
+				submitted = -EAGAIN;
+				goto done;
+			}
+			min_complete = min(min_complete, (u32)submitted);
+		}
 	}
 	if (flags & IORING_ENTER_GETEVENTS) {
 		unsigned nr_events = 0;
@@ -5284,7 +5289,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 			ret = io_cqring_wait(ctx, min_complete, sig, sigsz);
 		}
 	}
-
+done:
 	percpu_ref_put(&ctx->refs);
 out_fput:
 	fdput(f);
-- 
2.24.0
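
For illustration only, the capping rule described in the commit message can
be modelled in plain userspace C. This is a sketch, not kernel code:
struct enter_plan and plan_wait() are hypothetical names invented here, and
the skip_wait flag stands in for the "goto done" path in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the decision io_uring_enter() makes after
 * io_submit_sqes() returns. Not part of the kernel or of liburing. */
struct enter_plan {
	int ret;		/* -EAGAIN if nothing was submitted, else 0 */
	bool skip_wait;		/* true models the "goto done" path */
	uint32_t min_complete;	/* completions to wait for, after capping */
};

static struct enter_plan plan_wait(uint32_t to_submit, uint32_t submitted,
				   uint32_t min_complete)
{
	struct enter_plan p = {
		.ret = 0,
		.skip_wait = false,
		.min_complete = min_complete,
	};

	if (submitted != to_submit) {
		if (!submitted) {
			/* Nothing went in: report -EAGAIN and do not wait. */
			p.ret = -EAGAIN;
			p.skip_wait = true;
			return p;
		}
		/* Partial submission: never wait for more than was
		 * submitted by this call. */
		if (submitted < p.min_complete)
			p.min_complete = submitted;
	}
	return p;
}
```

Under this model, asking to submit 8 and wait for 8 when only 3 sqes were
consumed yields a wait for 3 completions instead of blocking on 5 requests
that were never submitted.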