Running local task_work requires taking uring_lock, for submit + wait we can try to run them right after submit while we still hold the lock and save one lock/unlokc pair. The optimisation was implemented in the first local tw patches but got dropped for simplicity. Suggested-by: Dylan Yudaken <dylany@xxxxxx> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx> --- io_uring/io_uring.c | 12 ++++++++++-- io_uring/io_uring.h | 7 +++++++ 2 files changed, 17 insertions(+), 2 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 355fc1f3083d..b092473eca1d 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3224,8 +3224,16 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit, mutex_unlock(&ctx->uring_lock); goto out; } - if ((flags & IORING_ENTER_GETEVENTS) && ctx->syscall_iopoll) - goto iopoll_locked; + if (flags & IORING_ENTER_GETEVENTS) { + if (ctx->syscall_iopoll) + goto iopoll_locked; + /* + * Ignore errors, we'll soon call io_cqring_wait() and + * it should handle ownership problems if any. + */ + if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) + (void)io_run_local_work_locked(ctx); + } mutex_unlock(&ctx->uring_lock); } diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h index e733d31f31d2..8504bc1f3839 100644 --- a/io_uring/io_uring.h +++ b/io_uring/io_uring.h @@ -275,6 +275,13 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx) return ret; } +static inline int io_run_local_work_locked(struct io_ring_ctx *ctx) +{ + if (llist_empty(&ctx->work_llist)) + return 0; + return __io_run_local_work(ctx, true); +} + static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked) { if (!*locked) { -- 2.37.3