On 2021/11/25 at 11:27 PM, Pavel Begunkov wrote:
On 11/25/21 11:37, Hao Xu wrote:
On 2021/11/25 at 5:41 AM, Pavel Begunkov wrote:
On 11/24/21 12:21, Hao Xu wrote:
v4->v5
- change the implementation of merge_wq_list
The only concern I had was about 6/6 not using inline completion
infra, when it's faster to grab ->uring_lock. i.e.
io_submit_flush_completions(), which should be faster when batching
is good.
Looking again through the code, the only user is SQPOLL
io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
And with SQPOLL the lock is mostly grabbed by the SQPOLL task only,
IOW for pure block rw there shouldn't be any contention.
There could still be other types of task work, like async buffered reads.
I considered the generic situation where different kinds of task work are
mixed in the task list: the inline completion infra always handles the
completions at the end, while in this new batching we first handle the
completions and commit_cqring, then do the other task work.
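To make the ordering difference described above concrete, here is a minimal userspace sketch. It is a toy model only: every name in it (toy_work, toy_run_deferred, toy_run_completions_first, and so on) is hypothetical and none of it is kernel code. It only records the order in which mixed task work items would finish under the two strategies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; these are NOT the kernel's structures. */
enum toy_kind { TOY_COMPLETION, TOY_OTHER };

struct toy_work {
    enum toy_kind kind;
    int id;
};

#define TOY_MAX 16

struct toy_log {
    int order[TOY_MAX]; /* ids in the order they were finished */
    int n;
};

static void toy_log_push(struct toy_log *log, int id)
{
    assert(log->n < TOY_MAX);
    log->order[log->n++] = id;
}

/*
 * Deferred style: other work runs as it is encountered, completions are
 * queued and only flushed once the whole list has been walked.
 */
static void toy_run_deferred(const struct toy_work *w, int n,
                             struct toy_log *log)
{
    int pending[TOY_MAX];
    int npending = 0;

    for (int i = 0; i < n; i++) {
        if (w[i].kind == TOY_COMPLETION) {
            assert(npending < TOY_MAX);
            pending[npending++] = w[i].id;
        } else {
            toy_log_push(log, w[i].id);
        }
    }
    /* Flush all queued completions at the very end. */
    for (int i = 0; i < npending; i++)
        toy_log_push(log, pending[i]);
}

/*
 * Completions-first style: handle every completion (and, in the real
 * code, commit the CQ ring) before touching any other task work.
 */
static void toy_run_completions_first(const struct toy_work *w, int n,
                                      struct toy_log *log)
{
    for (int i = 0; i < n; i++)
        if (w[i].kind == TOY_COMPLETION)
            toy_log_push(log, w[i].id);
    for (int i = 0; i < n; i++)
        if (w[i].kind == TOY_OTHER)
            toy_log_push(log, w[i].id);
}
```

With a list of other(1), completion(2), other(3), completion(4), the deferred variant finishes in the order 1, 3, 2, 4 and the completions-first variant in the order 2, 4, 1, 3.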
I was talking about 6/6 in particular. The reordering (done by the first
2 or 3 patches) sounds plausible, but if we compare, say, 1-5 vs the same but
+ patch 6/6
Ah, sorry, I misremembered the content of 6/6 and the previous ones.
Btw, I'm not sure the inline completion infra is faster than this
batching in the pure rw completion case (where all the task work items are
completions); from the code, they seem similar. Any hints about this?
I was looking through it, and apparently I placed the task_put optimisation
into io_req_complete_post() as well, see io_put_task().
pros of io_submit_flush_completions:
1) batched rsrc refs put
2) a bit better on assembly
3) shorter spin section (separate loop)
4) enqueueing right into ctx->submit_state.free_list, so no need for
   one io_flush_cached_reqs() call per IO_COMPL_BATCH=32
pros of io_req_complete_post() path:
1) no uring_lock locking (not contended)
2) de-virtualisation
3) no extra (yet another) list traversal and io_req_complete_state()
So, with put_task optimised, it's indeed not so clear which would win.
Did you use fixed rsrc for testing? (files or buffers)
No, I didn't. Let's first play it safe as you said:
if (locked) flush_completions
else new stuff
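
A minimal userspace sketch of that "play it safe" dispatch might look like the following. All names here (toy_req, toy_stats, toy_flush_completions, toy_batched_post, toy_complete_list) are hypothetical stand-ins, not the actual io_uring functions; the sketch only shows the branch on `locked`, with counters recording which path handled each request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; not the kernel's struct io_kiocb. */
struct toy_req {
    int result;
    struct toy_req *next;
};

/* Hypothetical counters recording which path handled how many requests. */
struct toy_stats {
    int flushed_inline; /* went through the "flush_completions" path */
    int posted_batched; /* went through the "new stuff" path */
};

/* Models the existing inline completion flush (lock already held). */
static void toy_flush_completions(struct toy_req *list, struct toy_stats *st)
{
    for (struct toy_req *r = list; r; r = r->next)
        st->flushed_inline++;
}

/* Models the new batched completion path (lock not held). */
static void toy_batched_post(struct toy_req *list, struct toy_stats *st)
{
    for (struct toy_req *r = list; r; r = r->next)
        st->posted_batched++;
}

/* The dispatch itself: if (locked) flush_completions, else new stuff. */
static void toy_complete_list(struct toy_req *list, bool locked,
                              struct toy_stats *st)
{
    assert(st != NULL);
    if (locked)
        toy_flush_completions(list, st);
    else
        toy_batched_post(list, st);
}
```

Calling toy_complete_list() on a two-request list with locked set to true bumps only flushed_inline by two; with locked set to false it bumps only posted_batched.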