If task_state is cleared, io_req_task_work_add() takes the slow path:
it adds a task_work, sets the task_state, wakes up the task and so on,
all of which is expensive.

tctx_task_work() first clears the state and then executes all the
queued work items, so if any of them resubmits or adds new task_work
items, it unnecessarily goes through the slow path of
io_req_task_work_add().

Let's clear ->task_state at the end instead. We still have to check
->task_list for emptiness afterwards to synchronise with
io_req_task_work_add(). Do that, and set the state back if we're going
to retry, because clearing a task_state that isn't ours on the next
iteration would be buggy.

Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
---
 fs/io_uring.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2fdca298e173..4353f64c10c4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1894,8 +1894,6 @@ static void tctx_task_work(struct callback_head *cb)
 	struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
 						  task_work);
 
-	clear_bit(0, &tctx->task_state);
-
 	while (1) {
 		struct io_wq_work_node *node;
 
@@ -1917,8 +1915,14 @@ static void tctx_task_work(struct callback_head *cb)
 			req->task_work.func(&req->task_work);
 			node = next;
 		}
-		if (wq_list_empty(&tctx->task_list))
-			break;
+		if (wq_list_empty(&tctx->task_list)) {
+			clear_bit(0, &tctx->task_state);
+			if (wq_list_empty(&tctx->task_list))
+				break;
+			/* another tctx_task_work() is enqueued, yield */
+			if (test_and_set_bit(0, &tctx->task_state))
+				break;
+		}
 		cond_resched();
 	}
 
-- 
2.31.1
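
For illustration only, here is a minimal single-threaded user-space
sketch of the ordering described in the commit message. It is not the
kernel code: struct model_tctx, model_init(), model_add() and
model_run() are hypothetical names, a plain counter stands in for
->task_list, and a C11 atomic_bool stands in for bit 0 of ->task_state.
It only counts how many adds would have taken the slow path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the kernel. */
struct model_tctx {
	atomic_bool running;	/* stands in for bit 0 of ->task_state */
	int pending;		/* stands in for the length of ->task_list */
	int slow_adds;		/* adds that had to schedule the handler */
	int fast_adds;		/* adds that found the state already set */
};

static void model_init(struct model_tctx *t)
{
	atomic_init(&t->running, false);
	t->pending = 0;
	t->slow_adds = 0;
	t->fast_adds = 0;
}

/* Models io_req_task_work_add(): queue first, then test-and-set the state. */
static void model_add(struct model_tctx *t)
{
	t->pending++;
	if (atomic_exchange(&t->running, true)) {
		t->fast_adds++;		/* handler already scheduled or running */
		return;
	}
	t->slow_adds++;			/* would add task_work and wake the task */
}

/*
 * Models the patched tctx_task_work(). Each of the first requeue_budget
 * work items queues one more item while it runs.
 */
static void model_run(struct model_tctx *t, int requeue_budget)
{
	/* The handler only runs after some add has set the state. */
	assert(atomic_load(&t->running));

	for (;;) {
		while (t->pending > 0) {
			t->pending--;
			if (requeue_budget > 0) {
				requeue_budget--;
				model_add(t);	/* state still set: fast path */
			}
		}
		/* Clear the state only once the list looks empty. */
		atomic_store(&t->running, false);
		/* Recheck: an add may have raced with the clear. */
		if (t->pending == 0)
			break;
		/* Take the state back before retrying; if it is already
		 * set, another handler run has been scheduled, so yield. */
		if (atomic_exchange(&t->running, true))
			break;
	}
}
```

In this model, one model_add() followed by model_run() with a requeue
budget of 3 records a single slow-path add; the three requeued items
all find the state still set. Clearing the flag at the top of
model_run() instead would send each requeue through the slow branch.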