[PATCH v3 11/11] io_uring: optimise deferred tw execution

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We needed fake nodes in __io_run_local_work() and to avoid unecessary wake
ups while the task already running task_works, but we don't need them
anymore since wake ups are protected by cq_waiting, which is always
cleared by the time we're executing deferred task_work items.

Note that because of loose sync around cq_waiting clearing
io_req_local_work_add() may wake the task more than once, but that's
fine and should be rare to not hurt perf.

Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
---
 io_uring/io_uring.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9c95ceb1a9f2..1dd0fc0412c8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1322,17 +1322,16 @@ static void __cold io_move_task_work_from_local(struct io_ring_ctx *ctx)
 static int __io_run_local_work(struct io_ring_ctx *ctx, bool *locked)
 {
 	struct llist_node *node;
-	struct llist_node fake;
-	struct llist_node *current_final = NULL;
+	unsigned int loops = 0;
 	int ret = 0;
-	unsigned int loops = 1;
 
 	if (WARN_ON_ONCE(ctx->submitter_task != current))
 		return -EEXIST;
-
-	node = io_llist_xchg(&ctx->work_llist, &fake);
+	if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
+		atomic_andnot(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
 again:
-	while (node != current_final) {
+	node = io_llist_xchg(&ctx->work_llist, NULL);
+	while (node) {
 		struct llist_node *next = node->next;
 		struct io_kiocb *req = container_of(node, struct io_kiocb,
 						    io_task_work.node);
@@ -1341,23 +1340,14 @@ static int __io_run_local_work(struct io_ring_ctx *ctx, bool *locked)
 		ret++;
 		node = next;
 	}
+	loops++;
 
-	if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
-		atomic_andnot(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
-
-	node = io_llist_cmpxchg(&ctx->work_llist, &fake, NULL);
-	if (node != &fake) {
-		loops++;
-		current_final = &fake;
-		node = io_llist_xchg(&ctx->work_llist, &fake);
+	if (!llist_empty(&ctx->work_llist))
 		goto again;
-	}
-
 	if (*locked)
 		io_submit_flush_completions(ctx);
 	trace_io_uring_local_work_run(ctx, ret, loops);
 	return ret;
-
 }
 
 static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
-- 
2.38.1




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux