Re: [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests

Hao Xu <haoxu@xxxxxxxxxxxxxxxxx> · Mon, 18 Oct 2021 20:20:13 +0800

On 2021/10/18 8:10 PM, Pavel Begunkov wrote:
On 10/18/21 11:34, Hao Xu wrote:
On 2021/10/18 7:29 PM, Hao Xu wrote:
The current logic for requests with IOSQE_ASYNC is to first queue them to
an io-worker, then execute them synchronously. For unbound work such as
pollable requests (e.g. read/write on a socket fd), the io-worker may get
stuck there waiting for events for a long time, and so other work waits in
the list for a long time too.
Let's introduce a new way of handling unbound work (currently pollable
requests): a request is first queued to an io-worker, then executed with a
nonblocking try rather than synchronously. If that fails, the worker arms
poll for the request and can then move on to handle other work.
The detailed process for this kind of request is:

step1: original context:
            queue it to io-worker
step2: io-worker context:
            nonblock try(the old logic is a synchronous try here)
                |
                |--fail--> arm poll
                             |
                             |--(fail/ready)-->synchronous issue
                             |
                             |--(succeed)-->worker finishes its job, tw
                                            takes over the req
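
The flow in the diagram can be approximated in userspace. The sketch below
is only an analogy under stated assumptions, not the kernel code from the
patch: hybrid_recv() and enum hybrid_path are hypothetical names, the
"nonblock try" is modelled with recv(MSG_DONTWAIT), and "arm poll" is
modelled by a plain poll() call. In the patch the worker would be free to
pick up other work once poll is armed; here the caller simply waits.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical labels for the branch of the diagram that ended the request. */
enum hybrid_path {
    PATH_NONBLOCK_DONE, /* the nonblocking try finished it (data, EOF or hard error) */
    PATH_AFTER_POLL,    /* poll reported readiness, then the request was reissued */
    PATH_NOT_READY      /* sketch-only outcome: nothing arrived before the timeout */
};

/* Userspace analogy of the hybrid flow for a receive on a socket. */
ssize_t hybrid_recv(int fd, void *buf, size_t len, int timeout_ms,
                    enum hybrid_path *path)
{
    assert(buf != NULL && path != NULL);

    /* Step 2: nonblocking try instead of a blocking (synchronous) call. */
    ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);
    if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK)) {
        *path = PATH_NONBLOCK_DONE;
        return n;
    }

    /* The try failed with "would block": arm poll and wait for readiness. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0) {
        *path = PATH_NOT_READY;
        if (ready == 0)
            errno = EAGAIN; /* timed out without an event */
        return -1;
    }

    /* The fd became ready: issue the request again. */
    *path = PATH_AFTER_POLL;
    return recv(fd, buf, len, MSG_DONTWAIT);
}
```

Called on one end of a socketpair() with nothing written yet and a timeout
of 0, this takes the PATH_NOT_READY branch; once the peer has sent data, the
first nonblocking try succeeds and poll is never armed.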

This works much better than the old IOSQE_ASYNC logic in cases where
the unbound max_worker limit is relatively small. In that case, the number
of io-workers easily reaches max_worker, no new worker can be created,
and the running workers are stuck handling old work in IOSQE_ASYNC mode.

On my 64-core machine, with unbound max_worker set to 20, running an
echo-server gives:
(arguments: register_file, connection count is 1000, message size is 12
bytes)
original IOSQE_ASYNC: 76664.151 tps
after this patch: 166934.985 tps

Suggested-by: Jens Axboe <axboe@xxxxxxxxx>
Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
An unrelated question: why do we do the linked timeout logic again in
io_wq_submit_work(), given that we've already done it in
io_queue_async_work()?

Because io_wq_free_work() may enqueue new work (by returning it)
without going through io_queue_async_work(), and we don't care
enough to split those cases.
Makes sense. Thanks.
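
Pavel's point above can be illustrated with a toy model. This is a sketch
with hypothetical names (struct toy_req and the toy_* functions), not the
kernel's structures or code: it only shows why arming in the submit path
still matters when a request reaches the worker by being returned from the
free-work step rather than through the queueing function. The guard that
makes arming a no-op the second time is an assumption of the toy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy request; every field is a hypothetical stand-in. */
struct toy_req {
    bool has_linked_timeout; /* request carries a linked timeout */
    bool timeout_armed;      /* set once the timeout has been armed */
    struct toy_req *next;    /* next request in the link chain, if any */
};

/* Arm at most once, so calling it from both paths is harmless in this toy. */
void toy_arm_linked_timeout(struct toy_req *req)
{
    assert(req != NULL);
    if (req->has_linked_timeout && !req->timeout_armed)
        req->timeout_armed = true;
}

/* Stand-in for io_queue_async_work(): arms, then hands off to a worker. */
void toy_queue_async(struct toy_req *req)
{
    toy_arm_linked_timeout(req);
}

/* Stand-in for io_wq_submit_work(): arms again, then would issue the request. */
void toy_submit_work(struct toy_req *req)
{
    toy_arm_linked_timeout(req);
}

/* Stand-in for io_wq_free_work(): returns the next work, bypassing the queue path. */
struct toy_req *toy_free_work(struct toy_req *req)
{
    return req->next;
}

/* Worker loop: each returned request goes straight to submit. */
void toy_worker_run(struct toy_req *req)
{
    while (req != NULL) {
        toy_submit_work(req);
        req = toy_free_work(req);
    }
}
```

With two linked requests where only the first goes through
toy_queue_async(), the second one gets its timeout armed only because
toy_submit_work() repeats the arming step.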