On 19/04/2022 15.04, Jens Axboe wrote:
On 4/19/22 5:57 AM, Avi Kivity wrote:
On 19/04/2022 14.38, Jens Axboe wrote:
On 4/19/22 5:07 AM, Avi Kivity wrote:
A simple webserver shows about 5% loss compared to linux-aio.
I expect the loss is due to an optimization that io_uring lacks -
inline completion vs workqueue completion:
I don't think that's it, io_uring never punts to a workqueue for
completions.
I measured this:
Performance counter stats for 'system wide':
1,273,756 io_uring:io_uring_task_add
12.288597765 seconds time elapsed
This exactly matches the number of requests sent. If that's the
wrong counter to measure, I'm happy to try again with the correct
counter.
io_uring_task_add() isn't a workqueue, it's task_work. So that is
expected.
Ah, and it should be fine. I'll try 'perf diff' again (I ran it but
didn't reach any conclusive results and assumed non-systemwide runs
weren't measuring workqueues (and systemwide runs generated too much
noise on my workstation)).
Do you have a test case of sorts?
Seastar's httpd, running on a single core, against wrk -c 1000 -t 4 http://localhost:10000/.
Instructions:
git clone --recursive -b io_uring https://github.com/avikivity/seastar
cd seastar
sudo ./install-dependencies.sh # after carefully verifying it, of course
./configure.py --mode release
ninja -C build/release apps/httpd/httpd
./build/release/apps/httpd/httpd --smp 1 [--reactor-backend io_uring|linux-aio|epoll]
and run wrk against it.
Thanks, I'll give that a spin!
Thanks. You may need ./configure.py --c++-dialect=c++17 if your C++
compiler is too old.
For a performance oriented network setup, I'd normally not consider data
readiness poll replacements to be that interesting, my recommendation
would be to use async send/recv for that instead. That's how io_uring is
supposed to be used, in a completion based model.
That's true. Still, an existing system that evolved around poll will
take some time and effort to migrate, and having a slower IORING_OP_POLL
means it cannot benefit from io_uring's many other advantages if it
fears a regression from that difference.
I'd like to separate the two - should the OP_POLL work as well, most
certainly. Do I think it's largely a useless way to run it, also yes :-)
Agree.
Note that it's not just a matter of converting poll+recvmsg to
IORING_OP_RECVMSG. If you support many connections, you must also migrate
to internal buffer selection, otherwise the memory load with a large
number of idle connections is high. The end result is wonderful but
the road there is long.
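
To put rough numbers on the memory point above, here is a minimal sketch of the arithmetic. It only compares bytes held by posted receive buffers and makes no io_uring calls. The helper names, the 16 KiB buffer size and the 64-buffer pool are assumptions made for illustration, not figures from this thread; only the 1000 connections come from the wrk command line.

```c
#include <stdint.h>

/* Hypothetical helper: bytes held when every connection keeps its own
 * receive buffer posted, e.g. one pending IORING_OP_RECVMSG per socket,
 * whether or not that socket is idle. */
static uint64_t per_connection_bytes(uint64_t connections, uint64_t buf_size)
{
    return connections * buf_size;
}

/* Hypothetical helper: bytes held when receives draw from a shared pool
 * and a buffer is only taken once data actually arrives, so the pool is
 * sized for concurrently active connections rather than all of them. */
static uint64_t shared_pool_bytes(uint64_t pool_buffers, uint64_t buf_size)
{
    return pool_buffers * buf_size;
}
```

With 1000 connections and an assumed 16 KiB buffer, per_connection_bytes(1000, 16384) comes to 16,384,000 bytes, while shared_pool_bytes(64, 16384) is 1,048,576 bytes. The first figure grows linearly with the connection count; the second is independent of how many connections sit idle.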
Totally agree. My point is just that to take full advantage of it, you
need to be using that kind of model and quick conversions aren't really
expected to yield much of a performance win. They are also not supposed
to run slower, so that does need some attention if that's the case here.
We're in agreement, but I'd like to clarify that the quick conversion is
intended to win from other aspects of io_uring, with the deeper change
coming later.