Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests


 



On 2021/10/12 7:39 PM, Pavel Begunkov wrote:
On 10/11/21 04:08, Hao Xu wrote:
On 2021/10/9 8:51 PM, Pavel Begunkov wrote:
On 10/8/21 13:36, Hao Xu wrote:
This is a new feature for pollable requests; see the details in the
commit message.

It really sounds like we should do it as a part of IOSQE_ASYNC, so
what are the cons and compromises?
I wrote the pros and cons here:
https://github.com/axboe/liburing/issues/426#issuecomment-939221300

I see. The problem, as always, is adding extra knobs that users have
to tune when it's not exactly clear where to use what. This is not
specific to the new flag; there is enough confusion around IOSQE_ASYNC
already, but the new flag only makes it worse. It would be nice to have
it applied "automatically".

Say, with IOSQE_ASYNC the copy is (almost) always done by io-wq, but
there is that polling optimisation on top. Do we care enough about
copying specifically in task context to have a different flag?

I did more tests on a 64-core machine.
The test command is: nice -n -15 taskset -c 10-20 ./io_uring_echo_server -p 8084 -f -n con_nr -l 1024
where -n is the number of connections and -l is the size of the load.
The tps and CPU usage under different IO pressure are as follows
(I couldn't find a way to make it look better, so you may need a
markdown renderer :) )
tps

| feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 |
| -------- | -------- | -------- | -------- | -------- | -------- | -------- |
| ASYNC | 123.000 | 295.203 | 67390.432 | 132686.361 | 186084.114 | 319550.540 |
| ASYNC_HYBRID | 122.000 | 299.401 | 168321.092 | 188870.283 | 226427.166 | 371580.062 |


cpu

| feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 |
| -------- | -------- | -------- | -------- | -------- | -------- | -------- |
| ASYNC | 0.3% | 1.0% | 62.5% | 111.3% | 198.3% | 420.9% |
| ASYNC_HYBRID | 0.3% | 1.0% | 360% | 435.5% | 516.6% | 1100% |

When con_nr is 1000 or more, we leveraged all 10 CPUs, and hybrid is
clearly better than async. When con_nr is 1 or 2, in theory async should
be better since it uses more CPU resources, but it wasn't, because the
copying in task_work (tw) is not a bottleneck. I then tried a bigger
workload and saw no difference. So I think it should be OK to just apply
this feature on top of IOSQE_ASYNC, for all pollable requests under all
conditions.
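
To read the two tables together, here is a small sketch in C that derives
the ASYNC_HYBRID/ASYNC throughput ratio and the tps per percent of CPU for
each connection count. The numbers are copied from the tables above; the
helper names (tps_ratio, tps_per_cpu, print_summary) are made up for
illustration, and "tps per CPU percent" is only one possible efficiency
measure, not something measured separately in these tests.

```c
#include <stdio.h>

#define N_LOADS 6

/* Connection counts (con_nr) used in the tests above. */
const int conns[N_LOADS] = {1, 2, 1000, 2000, 3000, 5800};

/* Throughput (tps) per connection count, copied from the tps table. */
const double async_tps[N_LOADS] = {
    123.000, 295.203, 67390.432, 132686.361, 186084.114, 319550.540};
const double hybrid_tps[N_LOADS] = {
    122.000, 299.401, 168321.092, 188870.283, 226427.166, 371580.062};

/* CPU usage in percent, copied from the cpu table. */
const double async_cpu[N_LOADS] = {0.3, 1.0, 62.5, 111.3, 198.3, 420.9};
const double hybrid_cpu[N_LOADS] = {0.3, 1.0, 360.0, 435.5, 516.6, 1100.0};

/* Hypothetical helper: how many times faster hybrid is than async. */
double tps_ratio(int i)
{
    return hybrid_tps[i] / async_tps[i];
}

/* Hypothetical helper: transactions per second per percent of CPU. */
double tps_per_cpu(const double *tps, const double *cpu, int i)
{
    return tps[i] / cpu[i];
}

/* Print one summary line per connection count. */
void print_summary(void)
{
    for (int i = 0; i < N_LOADS; i++) {
        printf("%5d conns: hybrid/async %.2fx, async %.1f tps/cpu%%, "
               "hybrid %.1f tps/cpu%%\n",
               conns[i], tps_ratio(i),
               tps_per_cpu(async_tps, async_cpu, i),
               tps_per_cpu(hybrid_tps, hybrid_cpu, i));
    }
}
```

Calling print_summary() from main prints one line per connection count.
By this measure hybrid is roughly 2.5x faster at 1000 connections, while
it spends more CPU per transaction than async at that load.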

Regards,
Hao
A quick question: what is "tps" in "IOSQE_ASYNC: 76664.151 tps"?

Hao Xu (2):
   io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests

btw, it doesn't make sense to split it into two patches
Hmm, I thought we should make adding a new flag a separate patch.
Could you give me more hints about the consideration here?

You can easily ignore it, just looked weird to me. Let's try to
phrase it:

1) 1/2 doesn't do anything useful w/o 2/2, iow it doesn't feel like
an atomic change. And it would break userspace if it's not just a
hint flag.

2) it's harder to read: you search the git history, find the
implementation (where the flag is already there), wonder what's
happening here and where the flag came from, and only then find out
that it was added separately a commit earlier.

3) sometimes it's done similarly because the API change is not
simple, but that's not the case here.
By similarly I mean the other way around: first implement it
internally without exposing any means to use it, then add
the userspace API in the next commits.

   io_uring: implementation of IOSQE_ASYNC_HYBRID logic

  fs/io_uring.c                 | 48 +++++++++++++++++++++++++++++++----
  include/uapi/linux/io_uring.h |  4 ++-
  2 files changed, 46 insertions(+), 6 deletions(-)







