On 2021/11/25 5:41 AM, Pavel Begunkov wrote:
On 11/24/21 12:21, Hao Xu wrote:
v4->v5
- change the implementation of merge_wq_list
The only concern I had was about 6/6 not using the inline completion
infra when it's faster to grab ->uring_lock, i.e.
io_submit_flush_completions(), which should be faster when batching
is good.
Looking again through the code, the only user is SQPOLL
io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
And with SQPOLL the lock is mostly grabbed by the SQPOLL task only,
IOW for pure block rw there shouldn't be any contention.
There could still be other types of task work, like async buffered reads.
I considered the generic situation where different kinds of task work are
mixed in the task list: the inline completion infra always handles the
completions at the end, while with this new batching we first handle the
completions and commit_cqring, then do the other task work.
Btw, I'm not sure the inline completion infra is faster than this
batching in the pure rw completion case (where all the task work is
completion); from the code, they seem similar. Any hints about this?
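
To make the ordering difference concrete, here is a toy userspace model
of the two schemes as described above. It is only a sketch: the names
(tw_kind, run_inline_flush, run_prior_batch) are made up for
illustration and are not from the patches, and each "task work" is
reduced to a character in a log ('o' = other task work, 'c' = completion
posted, '|' = CQ ring commit).

```c
#include <stddef.h>

/* Hypothetical classification of a task work item. */
enum tw_kind { TW_COMPLETION, TW_OTHER };

#define LOG_MAX 32
struct tw_log {
	char ev[LOG_MAX];	/* NUL-terminated event string */
	size_t n;
};

static void log_ev(struct tw_log *l, char c)
{
	if (l->n < LOG_MAX - 1) {
		l->ev[l->n++] = c;
		l->ev[l->n] = '\0';
	}
}

/*
 * Model of the inline completion infra as described above: handlers
 * run in list order, completions are only queued, and they are posted
 * and committed in one go at the end.
 */
static void run_inline_flush(const enum tw_kind *tw, size_t n,
			     struct tw_log *l)
{
	size_t i, pending = 0;

	for (i = 0; i < n; i++) {
		if (tw[i] == TW_COMPLETION)
			pending++;	/* deferred until the final flush */
		else
			log_ev(l, 'o');
	}
	for (i = 0; i < pending; i++)
		log_ev(l, 'c');
	log_ev(l, '|');
}

/*
 * Model of the new batching as described above: completions sit on a
 * separate priority list, so they are posted and committed first, and
 * the remaining task work runs afterwards.
 */
static void run_prior_batch(const enum tw_kind *tw, size_t n,
			    struct tw_log *l)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tw[i] == TW_COMPLETION)
			log_ev(l, 'c');
	log_ev(l, '|');
	for (i = 0; i < n; i++)
		if (tw[i] == TW_OTHER)
			log_ev(l, 'o');
}
```

For a mixed list (other, completion, other, completion) the first model
logs "oocc|" and the second "cc|oo". In this toy the schemes differ only
in when the commit happens relative to the other task work; it says
nothing about their relative cost.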
Regards,
Hao
Doesn't make much sense, what am I missing?
How many requests are completed on average per tctx_task_work()?
It doesn't apply to for-5.17/io_uring, here is a rebase:
https://github.com/isilence/linux.git haoxu_tw_opt
link: https://github.com/isilence/linux/tree/haoxu_tw_opt
With that, the first 5 patches look good, so for them:
Reviewed-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
but I still don't understand how 6/6 is better. Can it be because of
indirect branching? E.g. would something like this give the result?
-	req->io_task_work.func(req, locked);
+	INDIRECT_CALL_1(req->io_task_work.func, io_req_task_complete, req, locked);
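
For context, a rough userspace sketch of the idea behind
INDIRECT_CALL_1: compare the function pointer with the expected common
target and, on a match, call that target directly instead of through
the pointer. As far as I know the kernel macro only expands this way
when retpolines are enabled and also wraps the comparison in likely().
Everything below (struct req, task_complete, task_other,
SKETCH_INDIRECT_CALL_1) is a hypothetical stand-in, not kernel code.

```c
#include <stdbool.h>

struct req;
typedef void (*tw_func_t)(struct req *req, bool *locked);

/* Hypothetical stand-in for the request; func models io_task_work.func. */
struct req {
	tw_func_t func;
	int completed;
	int other;
};

/* Stand-in for the expected common handler (io_req_task_complete). */
static void task_complete(struct req *req, bool *locked)
{
	(void)locked;
	req->completed++;
}

/* Stand-in for any other task work handler. */
static void task_other(struct req *req, bool *locked)
{
	(void)locked;
	req->other++;
}

/*
 * Approximation of INDIRECT_CALL_1: if the pointer equals the likely
 * target, emit a direct call to it; otherwise fall back to the
 * indirect call through the pointer.
 */
#define SKETCH_INDIRECT_CALL_1(f, f1, ...) \
	((f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

static void run_task_work(struct req *req, bool *locked)
{
	SKETCH_INDIRECT_CALL_1(req->func, task_complete, req, locked);
}
```

Both paths invoke the same handler; the only difference is that the
matching case avoids the indirect branch, which is what the question
above is probing.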
Hao Xu (6):
io-wq: add helper to merge two wq_lists
io_uring: add a priority tw list for irq completion work
io_uring: add helper for task work execution code
io_uring: split io_req_complete_post() and add a helper
io_uring: move up io_put_kbuf() and io_put_rw_kbuf()
io_uring: batch completion in prior_task_list
fs/io-wq.h | 22 +++++++
fs/io_uring.c | 158 +++++++++++++++++++++++++++++++++-----------------
2 files changed, 128 insertions(+), 52 deletions(-)