Re: [PATCH v5 0/6] task work optimization

On 2021/11/25 at 5:41 AM, Pavel Begunkov wrote:
On 11/24/21 12:21, Hao Xu wrote:
v4->v5
- change the implementation of merge_wq_list

The only concern I had was about 6/6 not using the inline completion
infra when it's faster to grab ->uring_lock, i.e.
io_submit_flush_completions(), which should be faster when batching
is good.

Looking again through the code, the only user is SQPOLL

io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));

And with SQPOLL the lock is mostly grabbed by the SQPOLL task only,
IOW for pure block rw there shouldn't be any contention.
There could still be other types of task work, like async buffered reads.
I was considering the generic situation where different kinds of task work
are mixed in the task list: there the inline completion infra always handles
the completions at the end, while with this new batching we first handle the
completions and commit_cqring, then do the other task work.
Btw, I'm not sure the inline completion infra is faster than this
batching in the pure rw completion case (where all the task work is
completion); from the code, they seem similar. Any hints about this?
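
To make the ordering difference above concrete, here is a rough userspace
model. It is not kernel code: every name in it (tw_node, run_flush_at_end,
run_prior_first, COMMIT_MARK) is made up for illustration, and "commit" only
stands in for the point where commit_cqring would happen.

```c
#include <stddef.h>

#define TRACE_MAX   16
#define COMMIT_MARK 0   /* hypothetical marker for "CQ ring committed" */

enum tw_kind { TW_COMPLETION, TW_OTHER };

struct tw_node {
	enum tw_kind kind;
	int id;                 /* positive id, so it never equals COMMIT_MARK */
	struct tw_node *next;
};

/* Records the order in which events happen. */
struct trace {
	int ev[TRACE_MAX];
	size_t n;
};

static void record(struct trace *t, int ev)
{
	if (t->n < TRACE_MAX)
		t->ev[t->n++] = ev;
}

/*
 * Model of the inline completion path as described above: one mixed list,
 * other work runs immediately, completions are deferred and flushed
 * (then committed) at the very end.
 */
static void run_flush_at_end(const struct tw_node *list, struct trace *t)
{
	int deferred[TRACE_MAX];
	size_t nd = 0;

	for (const struct tw_node *n = list; n; n = n->next) {
		if (n->kind == TW_COMPLETION) {
			if (nd < TRACE_MAX)
				deferred[nd++] = n->id;
		} else {
			record(t, n->id);
		}
	}
	for (size_t i = 0; i < nd; i++)
		record(t, deferred[i]);
	record(t, COMMIT_MARK);
}

/*
 * Model of the new batching: the priority list (completions) is handled
 * and committed first, then the remaining task work runs.
 */
static void run_prior_first(const struct tw_node *prior,
			    const struct tw_node *normal, struct trace *t)
{
	for (const struct tw_node *n = prior; n; n = n->next)
		record(t, n->id);
	record(t, COMMIT_MARK);
	for (const struct tw_node *n = normal; n; n = n->next)
		record(t, n->id);
}
```

In this model, a queue holding completion 1, other work 2 and completion 3
runs as 2, 1, 3, commit in the first function and as 1, 3, commit, 2 in the
second, so completions become visible before the unrelated work runs.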

Regards,
Hao
Doesn't make much sense, what am I missing?
How many requests are completed on average per tctx_task_work()?


It doesn't apply to for-5.17/io_uring, here is a rebase:
https://github.com/isilence/linux.git haoxu_tw_opt
link: https://github.com/isilence/linux/tree/haoxu_tw_opt

With that, the first 5 patches look good, so for them:
Reviewed-by: Pavel Begunkov <asml.silence@xxxxxxxxx>

but I still don't understand how 6/6 is better. Can it be because of
indirect branching? E.g. would something like this give the same result?

- req->io_task_work.func(req, locked);
+ INDIRECT_CALL_1(req->io_task_work.func, io_req_task_complete, req, locked);
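
For readers unfamiliar with the macro, a minimal userspace sketch of the idea
behind INDIRECT_CALL_1 follows. It is an approximation, not the kernel
definition: SKETCH_INDIRECT_CALL_1, complete_fn, other_fn and run_task_work
are hypothetical names, and as far as I know the kernel macro only expands to
the pointer comparison when retpolines are enabled, falling back to a plain
indirect call otherwise. Assumes GCC or Clang for __builtin_expect.

```c
#include <stdbool.h>

struct req;
typedef void (*tw_func_t)(struct req *req, bool *locked);

struct req {
	tw_func_t func;     /* stands in for req->io_task_work.func */
	int completed;
};

/* Hypothetical stand-in for the expected common target. */
static void complete_fn(struct req *req, bool *locked)
{
	(void)locked;
	req->completed = 1;
}

/* Hypothetical stand-in for any other task work handler. */
static void other_fn(struct req *req, bool *locked)
{
	(void)locked;
	req->completed = 2;
}

/*
 * If the pointer matches the expected target, call that target directly
 * (a direct, predictable branch); otherwise fall back to the indirect call.
 */
#define SKETCH_INDIRECT_CALL_1(f, f1, ...) \
	(__builtin_expect((f) == (f1), 1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

static void run_task_work(struct req *req, bool *locked)
{
	SKETCH_INDIRECT_CALL_1(req->func, complete_fn, req, locked);
}
```

Both paths call the same function the pointer refers to; only the kind of
branch the CPU sees differs, which is why it could matter for a hot loop.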


Hao Xu (6):
   io-wq: add helper to merge two wq_lists
   io_uring: add a priority tw list for irq completion work
   io_uring: add helper for task work execution code
   io_uring: split io_req_complete_post() and add a helper
   io_uring: move up io_put_kbuf() and io_put_rw_kbuf()
   io_uring: batch completion in prior_task_list

  fs/io-wq.h    |  22 +++++++
  fs/io_uring.c | 158 +++++++++++++++++++++++++++++++++-----------------
  2 files changed, 128 insertions(+), 52 deletions(-)





