On 9/22/21 1:34 PM, Xiaoguang Wang wrote:
> For an echo-server based on io_uring's IORING_POLL_ADD_MULTI feature,
> only poll requests are completed in task work; normal read/write
> requests are issued when the user app sees cqes on the corresponding
> poll requests, and they will mostly read/write data successfully, so
> they don't need task work. So at least for the echo-server model,
> batching poll request completion properly will give benefits.
>
> Currently I don't find any appropriate place to store batched poll
> requests, so put them in struct io_submit_state temporarily; I think
> it'll need rework in the future.
>
> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@xxxxxxxxxxxxxxxxx>
> ---
>  fs/io_uring.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 63 insertions(+), 1 deletion(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 6fdfb688cf91..14118388bfc6 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -321,6 +321,11 @@ struct io_submit_state {
>  	 */
>  	struct io_kiocb		*compl_reqs[IO_COMPL_BATCH];
>  	unsigned int		compl_nr;
> +
> +	struct io_kiocb		*poll_compl_reqs[IO_COMPL_BATCH];
> +	bool			poll_req_status[IO_COMPL_BATCH];
> +	unsigned int		poll_compl_nr;
> +
>  	/* inline/task_work completion list, under ->uring_lock */
>  	struct list_head	free_list;
>
> @@ -2093,6 +2098,8 @@ static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
>  	percpu_ref_put(&ctx->refs);
>  }
>
> +static void io_poll_flush_completions(struct io_ring_ctx *ctx, bool *locked);
> +
>  static void tctx_task_work(struct callback_head *cb)
>  {
>  	bool locked = false;
> @@ -2103,8 +2110,11 @@ static void tctx_task_work(struct callback_head *cb)
>  	while (1) {
>  		struct io_wq_work_node *node;
>
> -		if (!tctx->task_list.first && locked && ctx->submit_state.compl_nr)
> +		if (!tctx->task_list.first && locked && (ctx->submit_state.compl_nr ||
> +		    ctx->submit_state.poll_compl_nr)) {

io_submit_flush_completions() shouldn't be called if there are no
requests... And in for-next the check is already inside it, so this
will be:

if (... && locked) {
	io_submit_flush_completions();
	if (poll_compl_nr)
		io_poll_flush_completions();
}

>  			io_submit_flush_completions(ctx);
> +			io_poll_flush_completions(ctx, &locked);
> +		}
>
>  		spin_lock_irq(&tctx->task_lock);
>  		node = tctx->task_list.first;
>
> @@ -5134,6 +5144,49 @@ static inline bool io_poll_complete(struct io_kiocb *req, __poll_t mask)
>  static bool __io_poll_remove_one(struct io_kiocb *req,
>  				 struct io_poll_iocb *poll, bool do_cancel);
>
> +static void io_poll_flush_completions(struct io_ring_ctx *ctx, bool *locked)
> +	__must_hold(&ctx->uring_lock)
> +{
> +	struct io_submit_state *state = &ctx->submit_state;
> +	struct io_kiocb *req, *nxt;
> +	int i, nr = state->poll_compl_nr;
> +	bool done, skip_done = true;
> +
> +	spin_lock(&ctx->completion_lock);
> +	for (i = 0; i < nr; i++) {
> +		req = state->poll_compl_reqs[i];
> +		done = __io_poll_complete(req, req->result);

I believe we first need to fix all the poll problems and lay out
something more intuitive than the current implementation, or it'd be
pure hell to do afterwards.

Can be a nice addition, curious about numbers as well.

-- 
Pavel Begunkov
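
As background, below is a minimal userspace sketch of the echo-server
pattern the commit message describes, built on IORING_POLL_ADD_MULTI via
liburing: a multishot poll is armed once per connection, each poll cqe
makes the app queue a normal recv request, and each recv cqe queues the
matching send. Because the socket is already readable, the recv/send
requests mostly complete inline, so only the poll completions go through
task work. This is purely illustrative and not part of the patch under
review; the user_data tagging scheme, the single shared buffer, and the
omission of accept/error handling are assumptions made for the example.

/*
 * Illustrative sketch only: one connection, one shared buffer,
 * no error handling. Assumes liburing with multishot poll support.
 */
#include <liburing.h>
#include <poll.h>
#include <stdint.h>

enum { OP_POLL, OP_RECV, OP_SEND };

static char buf[4096];

/* Encode the fd and the operation type into user_data (example scheme). */
static void tag(struct io_uring_sqe *sqe, int fd, int op)
{
	io_uring_sqe_set_data(sqe, (void *)(uintptr_t)(((unsigned)fd << 2) | op));
}

static void echo_loop(struct io_uring *ring, int fd)
{
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;

	/* Arm the multishot poll once; it keeps posting cqes on POLLIN. */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_poll_multishot(sqe, fd, POLLIN);
	tag(sqe, fd, OP_POLL);
	io_uring_submit(ring);

	for (;;) {
		if (io_uring_wait_cqe(ring, &cqe))
			break;

		uintptr_t data = (uintptr_t)io_uring_cqe_get_data(cqe);
		int op = data & 3;
		int res = cqe->res;

		io_uring_cqe_seen(ring, cqe);
		if (res <= 0)
			break;

		if (op == OP_POLL) {
			/* Socket is readable: issue a normal recv request. */
			sqe = io_uring_get_sqe(ring);
			io_uring_prep_recv(sqe, fd, buf, sizeof(buf), 0);
			tag(sqe, fd, OP_RECV);
			io_uring_submit(ring);
		} else if (op == OP_RECV) {
			/* Echo back exactly what was received. */
			sqe = io_uring_get_sqe(ring);
			io_uring_prep_send(sqe, fd, buf, res, 0);
			tag(sqe, fd, OP_SEND);
			io_uring_submit(ring);
		}
		/* OP_SEND: nothing to queue, wait for the next poll cqe. */
	}
}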