Re: [RFC PATCH 2/2] io_uring/io-wq: try to batch multiple free work

On 2/21/25 14:52, Bui Quang Minh wrote:
On 2/21/25 19:44, Pavel Begunkov wrote:
On 2/21/25 04:19, Bui Quang Minh wrote:
Currently, in case we don't use IORING_SETUP_DEFER_TASKRUN, when an io
worker frees work, it needs to add a task work. This creates contention
on tctx->task_list. With this commit, an io worker queues freed work on
a local list and batches multiple frees into one call once the number of
entries on the local list exceeds IO_REQ_ALLOC_BATCH.
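
In userspace terms, the batching scheme described above boils down to
something like the sketch below; all names here (free_node, FLUSH_BATCH,
flush_free_list, queue_free) are illustrative, not the kernel
identifiers.

/*
 * Illustrative sketch: freed work is queued on a local singly linked
 * list and flushed in a single call once the batch is big enough or
 * the ring context changes.
 */
#include <stdio.h>

#define FLUSH_BATCH 8 /* plays the role the patch gives IO_REQ_ALLOC_BATCH */

struct free_node {
	struct free_node *next;
	int ctx_id; /* stand-in for req->ctx */
};

static struct free_node *head, *tail;
static int nr_queued;

static void flush_free_list(void)
{
	/* one batched call instead of one task work per request */
	for (struct free_node *n = head; n; n = n->next)
		printf("free item from ctx %d\n", n->ctx_id);
	head = tail = NULL;
	nr_queued = 0;
}

static void queue_free(struct free_node *node)
{
	/* all batched items must share the same ctx */
	if (head && head->ctx_id != node->ctx_id)
		flush_free_list();

	node->next = NULL;
	if (tail)
		tail->next = node;
	else
		head = node;
	tail = node;

	if (++nr_queued >= FLUSH_BATCH)
		flush_free_list();
}

int main(void)
{
	struct free_node nodes[10];

	for (int i = 0; i < 10; i++) {
		nodes[i].ctx_id = i < 5 ? 1 : 2; /* ctx switch forces a flush */
		queue_free(&nodes[i]);
	}
	flush_free_list(); /* drain whatever is left */
	return 0;
}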

I see no relation to IO_REQ_ALLOC_BATCH; that should be
a separate macro.

Signed-off-by: Bui Quang Minh <minhquangbui99@xxxxxxxxx>
---
  io_uring/io-wq.c    | 62 +++++++++++++++++++++++++++++++++++++++++++--
  io_uring/io-wq.h    |  4 ++-
  io_uring/io_uring.c | 23 ++++++++++++++---
  io_uring/io_uring.h |  6 ++++-
  4 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 5d0928f37471..096711707db9 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
...
@@ -601,7 +622,41 @@ static void io_worker_handle_work(struct io_wq_acct *acct,
              wq->do_work(work);
              io_assign_current_work(worker, NULL);
 
-            linked = wq->free_work(work);
+            /*
+             * All requests in free list must have the same
+             * io_ring_ctx.
+             */
+            if (last_added_ctx && last_added_ctx != req->ctx) {
+                flush_req_free_list(&free_list, tail);
+                tail = NULL;
+                last_added_ctx = NULL;
+                free_req = 0;
+            }
+
+            /*
+             * Try to batch free work when
+             * !IORING_SETUP_DEFER_TASKRUN to reduce contention
+             * on tctx->task_list.
+             */
+            if (req->ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+                linked = wq->free_work(work, NULL, NULL);
+            else
+                linked = wq->free_work(work, &free_list, &did_free);

The problem here is that iowq is blocking, and hence you lock up the
resources of already completed requests for who knows how long. In the
case of unbound requests (see IO_WQ_ACCT_UNBOUND) it's indefinite, and
it absolutely cannot be used without some kind of a timer. But even in
the case of bound work, it can be pretty long.

That's a good point; I've overlooked the fact that the work handler
might block indefinitely.

Maybe, for bound requests, it can target N like here, but read jiffies
in between each request and flush if it has been too long. So in the
worst case the total delay is the last request's execution time + DT.
But even then it feels wrong, especially with filesystems sometimes not
even honouring NOWAIT.
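
Concretely, that flush condition could look roughly like the untested
sketch below, reusing free_list/tail/last_added_ctx/free_req from the
patch; IO_REQ_FREE_BATCH, IO_REQ_FREE_TIMEOUT and batch_start are
made-up names:

	if (free_req >= IO_REQ_FREE_BATCH ||
	    time_after(jiffies, batch_start + IO_REQ_FREE_TIMEOUT)) {
		/* batch is full or stale: hand it over in one go */
		flush_req_free_list(&free_list, tail);
		tail = NULL;
		last_added_ctx = NULL;
		free_req = 0;
		batch_start = jiffies;	/* next batch's clock starts now */
	}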

The question is, why do you force it into the worker pool with the
IOSQE_ASYNC flag? It's generally not recommended, and the flag's name
is confusing; it should've been something more like "WORKER_OFFLOAD".


I launched more workers to parallelize the work handler, but as you said, it seems like an incorrect use case.

Not so much incorrect as generally inefficient and not recommended,
especially before trying it without the flag first. People often fall
into the trap of "Oh, it's _async_, surely I have to set it"; really
unfortunate naming.

However, the request free path seems heavy: we need to create a task work so that we can hold the uring_lock to queue the request onto ctx->submit_state.compl_reqs. Let me play around more to see if I can find an optimization for this.
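
Roughly, the per-request cost looks like this (a simplified picture;
free_one_req_tw is an illustrative name for the task-work callback, not
the actual one):

	/* io-wq side: one task work queued per freed request ... */
	req->io_task_work.func = free_one_req_tw;
	io_req_task_work_add(req);	/* contends on tctx->task_list */

	/* ... later, in task context with ctx->uring_lock held: */
	wq_list_add_tail(&req->comp_list, &ctx->submit_state.compl_reqs);

So each freed request pays for one tctx->task_list insertion, which is
exactly the contention the patch tries to batch away.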

That's because it's a slow fallback path for cases that can't do
async for one reason or another, and ideally we wouldn't have it
at all. In reality it's used more than I'd wish for, but it's
still a path we don't heavily optimise.

Btw, if you're really spamming iowq threads, I'm surprised that's
the only hotspot you have. There should be some contention for
CQE posting (->completion_lock), and probably in the iowq queue
locking, and so on.

--
Pavel Begunkov




