Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests

On 10/14/21 09:53, Hao Xu wrote:
On 2021/10/12 7:39 PM, Pavel Begunkov wrote:
On 10/11/21 04:08, Hao Xu wrote:
On 2021/10/9 8:51 PM, Pavel Begunkov wrote:
On 10/8/21 13:36, Hao Xu wrote:
This is a new feature for pollable requests; see the details in the
commit message.

It really sounds like we should do it as part of IOSQE_ASYNC, so
what are the cons and compromises?
I wrote the pros and cons here:
https://github.com/axboe/liburing/issues/426#issuecomment-939221300

I see. The problem, as always, is adding extra knobs that users have
to tune when it's not exactly clear where to use what. This is not
specific to the new flag, there is already enough confusion around
IOSQE_ASYNC, but it only makes it worse. It would be nice to have it
applied "automatically".

Say, with IOSQE_ASYNC the copy is (almost) always done by io-wq, but
there is that polling optimisation on top. Do we care enough about
copying specifically in task context to have a different flag?

I ran more tests on a 64-core machine.
The test command is: nice -n -15 taskset -c 10-20 ./io_uring_echo_server -p 8084 -f -n con_nr -l 1024
where -n is the number of connections and -l is the size of the load.
The tps and CPU usage under different IO pressure are:
(I didn't find a way to make it look better; you may need a markdown
renderer :) )
tps (columns are con_nr, the number of connections)

| feature      | 1       | 2       | 1000       | 2000       | 3000       | 5800       |
| ------------ | ------- | ------- | ---------- | ---------- | ---------- | ---------- |
| ASYNC        | 123.000 | 295.203 | 67390.432  | 132686.361 | 186084.114 | 319550.540 |
| ASYNC_HYBRID | 122.000 | 299.401 | 168321.092 | 188870.283 | 226427.166 | 371580.062 |


cpu (columns are con_nr, the number of connections)

| feature      | 1    | 2    | 1000  | 2000   | 3000   | 5800   |
| ------------ | ---- | ---- | ----- | ------ | ------ | ------ |
| ASYNC        | 0.3% | 1.0% | 62.5% | 111.3% | 198.3% | 420.9% |
| ASYNC_HYBRID | 0.3% | 1.0% | 360%  | 435.5% | 516.6% | 1100%  |

When con_nr is 1000 or more, we leveraged all the 10 CPUs, and hybrid
is clearly better than async. When con_nr is 1 or 2, async should in
theory be better since it uses more CPU resources, but it wasn't,
because the copying in tw (task work) is not a bottleneck. I also tried
a bigger workload and saw no difference. So I think it should be OK to
just apply this feature on top of IOSQE_ASYNC, for all pollable
requests in all conditions.
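
To make the comparison easier to read, here is a small Python sketch that derives two figures from the tables above: the throughput of ASYNC_HYBRID relative to ASYNC, and the tps achieved per fully busy core. The numbers are copied from the tables; the helper names (`speedup`, `tps_per_core`, `summary`) are made up for this illustration, and the sketch assumes the cpu column is a percentage of one core (100% equals one fully busy CPU).

```python
# Figures copied from the tps and cpu tables; columns are con_nr.
CON_NR = [1, 2, 1000, 2000, 3000, 5800]

TPS = {
    "ASYNC": [123.000, 295.203, 67390.432, 132686.361, 186084.114, 319550.540],
    "ASYNC_HYBRID": [122.000, 299.401, 168321.092, 188870.283, 226427.166, 371580.062],
}

# Assumption: CPU usage is a percentage of one core (100.0 == one busy CPU).
CPU = {
    "ASYNC": [0.3, 1.0, 62.5, 111.3, 198.3, 420.9],
    "ASYNC_HYBRID": [0.3, 1.0, 360.0, 435.5, 516.6, 1100.0],
}


def speedup(i):
    """Throughput of ASYNC_HYBRID relative to ASYNC for column i."""
    return TPS["ASYNC_HYBRID"][i] / TPS["ASYNC"][i]


def tps_per_core(feature, i):
    """Transactions per second per fully busy core for column i."""
    return TPS[feature][i] / (CPU[feature][i] / 100.0)


def summary():
    """One row per con_nr: (con_nr, speedup, ASYNC tps/core, HYBRID tps/core)."""
    return [
        (n, speedup(i), tps_per_core("ASYNC", i), tps_per_core("ASYNC_HYBRID", i))
        for i, n in enumerate(CON_NR)
    ]


if __name__ == "__main__":
    print(f"{'con_nr':>6} {'speedup':>8} {'async tps/core':>15} {'hybrid tps/core':>16}")
    for n, s, a, h in summary():
        print(f"{n:>6} {s:>8.2f} {a:>15.1f} {h:>16.1f}")
```

With these figures the hybrid mode reaches roughly 2.5 times the ASYNC throughput at 1000 connections, while the tps per busy core is lower than for ASYNC, so the gain comes from using more of the available CPUs.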

Sounds great. And IOSQE_ASYNC is a hint flag, so if things change
we can restore the previous behaviour of IOSQE_ASYNC and add the new
flag (or do something else).


--
Pavel Begunkov


