On 12/7/21 09:39, Hao Xu wrote:
> In previous patches, we have already gathered some task works with
> io_req_task_complete() as the callback in prior_task_list; let's
> complete them in batch when we cannot grab the uring lock. In this
> way, we batch the req_complete_post path.
> Tested-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
Hao, please never add tags for other people unless they have confirmed
that it's fine. I've asked Jens to drop this one, and my Signed-off-by
on 4/5, from the io_uring branches.
> Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
> ---
> Hi Pavel,
> May I add the above Tested-by tag here?
>  fs/io_uring.c | 70 +++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 60 insertions(+), 10 deletions(-)
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 21738ed7521e..f224f8df77a1 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2225,6 +2225,49 @@ static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
>  	percpu_ref_put(&ctx->refs);
>  }
> +static inline void ctx_commit_and_unlock(struct io_ring_ctx *ctx)
> +{
> +	io_commit_cqring(ctx);
> +	spin_unlock(&ctx->completion_lock);
> +	io_cqring_ev_posted(ctx);
> +}
> +
> +static void handle_prior_tw_list(struct io_wq_work_node *node, struct io_ring_ctx **ctx,
> +				 bool *uring_locked, bool *compl_locked)
compl_locked can probably be a local variable here; you're clearing it
at the end of the function anyway.
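A rough sketch of what that could look like (untested; the body is the
same as in the patch, just with the flag kept local):

static void handle_prior_tw_list(struct io_wq_work_node *node,
				 struct io_ring_ctx **ctx, bool *uring_locked)
{
	/* completion_lock state never outlives this function */
	bool compl_locked = false;

	do {
		struct io_wq_work_node *next = node->next;
		struct io_kiocb *req = container_of(node, struct io_kiocb,
						    io_task_work.node);

		if (req->ctx != *ctx) {
			if (compl_locked) {
				ctx_commit_and_unlock(*ctx);
				compl_locked = false;
			}
			ctx_flush_and_put(*ctx, uring_locked);
			*ctx = req->ctx;
			/* if not contended, grab and improve batching */
			*uring_locked = mutex_trylock(&(*ctx)->uring_lock);
			percpu_ref_get(&(*ctx)->refs);
			if (unlikely(!*uring_locked)) {
				spin_lock(&(*ctx)->completion_lock);
				compl_locked = true;
			}
		}
		if (likely(*uring_locked))
			req->io_task_work.func(req, uring_locked);
		else
			__io_req_complete_post(req, req->result,
					       io_put_kbuf(req));
		node = next;
	} while (node);

	if (compl_locked)
		ctx_commit_and_unlock(*ctx);
}

That also drops the extra parameter from the caller's side.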
> +{
> +	do {
> +		struct io_wq_work_node *next = node->next;
> +		struct io_kiocb *req = container_of(node, struct io_kiocb,
> +						    io_task_work.node);
> +
> +		if (req->ctx != *ctx) {
> +			if (unlikely(*compl_locked)) {
> +				ctx_commit_and_unlock(*ctx);
> +				*compl_locked = false;
> +			}
> +			ctx_flush_and_put(*ctx, uring_locked);
> +			*ctx = req->ctx;
> +			/* if not contended, grab and improve batching */
> +			*uring_locked = mutex_trylock(&(*ctx)->uring_lock);
> +			percpu_ref_get(&(*ctx)->refs);
> +			if (unlikely(!*uring_locked)) {
> +				spin_lock(&(*ctx)->completion_lock);
> +				*compl_locked = true;
> +			}
> +		}
> +		if (likely(*uring_locked))
> +			req->io_task_work.func(req, uring_locked);
> +		else
> +			__io_req_complete_post(req, req->result, io_put_kbuf(req));
I think there is the same issue as last time: the first iteration of
tctx_task_work() can set ctx without getting the uring_lock. Then you
get here, find a request with the same ctx, skip the locking branch,
and end up calling __io_req_complete_post() without holding the
completion_lock.
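One way out might be to take the completion_lock up front when the
caller hands in a ctx it set without getting the uring_lock, so that
the !*uring_locked path below always holds it. An untested sketch,
keeping the flag as in the patch:

	/* the caller may have set *ctx while failing the uring_lock trylock */
	if (*ctx && !*uring_locked) {
		spin_lock(&(*ctx)->completion_lock);
		*compl_locked = true;
	}

at the top of handle_prior_tw_list().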
> +		node = next;
> +	} while (node);
> +
> +	if (unlikely(*compl_locked)) {
> +		ctx_commit_and_unlock(*ctx);
> +		*compl_locked = false;
> +	}
> +}
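For reference, the caller state I mean: this is my reconstruction of
tctx_task_work() from the earlier patches in the series, so the exact
details may differ from what's posted there:

static void tctx_task_work(struct callback_head *cb)
{
	struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
						  task_work);
	struct io_ring_ctx *ctx = NULL;
	bool uring_locked = false, compl_locked = false;

	while (1) {
		struct io_wq_work_node *node1, *node2;

		/* splice both pending lists under tctx->task_lock */
		spin_lock_irq(&tctx->task_lock);
		node1 = tctx->prior_task_list.first;
		node2 = tctx->task_list.first;
		INIT_WQ_LIST(&tctx->prior_task_list);
		INIT_WQ_LIST(&tctx->task_list);
		if (!node1 && !node2)
			tctx->task_running = false;
		spin_unlock_irq(&tctx->task_lock);
		if (!node1 && !node2)
			break;

		if (node1)
			handle_prior_tw_list(node1, &ctx, &uring_locked,
					     &compl_locked);
		if (node2)
			handle_tw_list(node2, &ctx, &uring_locked);
		cond_resched();
	}

	ctx_flush_and_put(ctx, &uring_locked);
}

ctx and uring_locked persist across loop iterations, so
handle_prior_tw_list() can be entered with *ctx already set by a
handle_tw_list() pass that failed the trylock, with neither lock held.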
--
Pavel Begunkov