Hi Jens,
No problem - have you been able to test the current repo in general? I want to
cut a 2.3 release shortly, but since that particular change impacts any kind of
cqe waiting, would be nice to have a bit more confidence in it.
At least the timing bug is still fixed (as with my change).
I'm currently prototyping an IORING_POLL_CANCEL_ON_CLOSE
flag that can be passed to POLL_ADD. With that we'll register
the request in &req->file->f_uring_poll (similar to the file->f_ep list for epoll).
Then we only hold a real reference to the file during the call to
vfs_poll(); otherwise we drop the fget/fput reference and rely on
an io_uring_poll_release_file() (similar to eventpoll_release_file())
to cancel our registered poll request.
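To make the idea concrete, here is a user-space toy model of the cancel-on-close bookkeeping. It is only a sketch: every name in it (toy_file, toy_poll_req, toy_poll_register, toy_poll_release_file) is hypothetical, it is not kernel code, and it leaves out the locking entirely, which is exactly the part that still needs care.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct toy_poll_req {
    struct toy_poll_req *next;  /* link in the owning file's list */
    int result;                 /* 0 while pending, -ECANCELED once cancelled */
};

struct toy_file {
    struct toy_poll_req *uring_poll;  /* plays the role of file->f_uring_poll */
};

/* Register a poll request on the file without pinning the file. */
static void toy_poll_register(struct toy_file *f, struct toy_poll_req *req)
{
    assert(f != NULL && req != NULL);
    req->result = 0;
    req->next = f->uring_poll;
    f->uring_poll = req;
}

/*
 * Called when the file goes away, in the spirit of eventpoll_release_file():
 * unlink and cancel every poll request that is still registered.
 * Returns the number of requests cancelled.
 */
static int toy_poll_release_file(struct toy_file *f)
{
    int cancelled = 0;

    assert(f != NULL);
    while (f->uring_poll != NULL) {
        struct toy_poll_req *req = f->uring_poll;

        f->uring_poll = req->next;
        req->next = NULL;
        req->result = -ECANCELED;
        cancelled++;
    }
    return cancelled;
}
```

The point of the model is that the request list, not a long-lived file reference, is what ties a pending poll to the file, so closing the file can complete the request instead of being blocked by it.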
Yes, this is a bit tricky as we hold the file ref across the operation. I'd
be interested in seeing your approach to this, and also how it would
interact with registered files...
Here's my current patch:
https://git.samba.org/?p=metze/linux/wip.git;a=commitdiff;h=b9cccfac515739fc279c6eec87ce655a96f94685
It compiles, but I haven't tested it yet. And I'm not sure if the locking is done correctly...
c) A simple pipe-based performance test shows the following numbers:
- 'poll': Got 232387.31 pipe events/sec
- 'epoll': Got 251125.25 pipe events/sec
- 'samba_io_uring_ev': Got 210998.77 pipe events/sec
So the io_uring backend is even slower than the 'poll' backend.
I guess the reason is the constant re-submission of IORING_OP_POLL_ADD.
Added some feature autodetection today and I'm now using
IORING_SETUP_COOP_TASKRUN, IORING_SETUP_TASKRUN_FLAG,
IORING_SETUP_SINGLE_ISSUER and IORING_SETUP_DEFER_TASKRUN if supported
by the kernel.
On a 6.1 kernel this improved the performance a lot, it's now faster
than the epoll backend.
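The autodetection can be sketched roughly as below. This is not the actual Samba code, just one way to do it under stated assumptions: pick_setup_flags(), ring_probe_fn and probe_with_mask() are hypothetical names, and the probe is a callback so the selection logic can be shown without a ring. A real probe would presumably attempt ring creation with the given flags (for example via liburing's io_uring_queue_init_params()) and report whether the kernel accepted them. The flag values are the ones from the kernel UAPI header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values as in <linux/io_uring.h>; defined here so the sketch also builds
 * against older headers that lack them. */
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN   (1U << 8)
#endif
#ifndef IORING_SETUP_TASKRUN_FLAG
#define IORING_SETUP_TASKRUN_FLAG   (1U << 9)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER  (1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN  (1U << 13)
#endif

/* Hypothetical probe: returns 0 if ring setup with `flags` would succeed,
 * otherwise a negative errno such as -EINVAL. */
typedef int (*ring_probe_fn)(unsigned int flags, void *ctx);

/* Try flag sets from most to least capable and return the first accepted. */
static unsigned int pick_setup_flags(ring_probe_fn probe, void *ctx)
{
    static const unsigned int candidates[] = {
        IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG |
            IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN,
        IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN,
        IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG |
            IORING_SETUP_SINGLE_ISSUER,
        IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG,
        0, /* plain ring, no optional flags */
    };
    size_t i;

    assert(probe != NULL);
    for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        if (probe(candidates[i], ctx) == 0)
            return candidates[i];
    }
    return 0;
}

/* Stand-in probe for experiments: pretends the kernel knows only the flags
 * in the mask that ctx points to, and rejects anything else with -EINVAL. */
static int probe_with_mask(unsigned int flags, void *ctx)
{
    unsigned int supported = *(unsigned int *)ctx;

    return (flags & ~supported) ? -EINVAL : 0;
}
```

Note that IORING_SETUP_DEFER_TASKRUN is only tried together with IORING_SETUP_SINGLE_ISSUER, since the kernel rejects it otherwise.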
The key flag is IORING_SETUP_DEFER_TASKRUN. On a different system than above
I'm getting the following numbers:
- epoll: Got 114450.16 pipe events/sec
- poll: Got 105872.52 pipe events/sec
- samba_io_uring_ev-without-defer_taskrun: Got 95564.22 pipe events/sec
- samba_io_uring_ev-with-defer_taskrun: Got 122853.85 pipe events/sec
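For comparing such runs it helps to express each rate relative to the epoll baseline. A small helper for that arithmetic is sketched below; relative_percent() and print_relative() are names made up for this illustration. With the numbers above, the defer_taskrun run comes out roughly 7% ahead of epoll, and the run without it roughly 16-17% behind.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage difference of `rate` relative to `baseline`
 * (positive means faster than the baseline). */
static double relative_percent(double rate, double baseline)
{
    assert(baseline > 0.0);
    return (rate / baseline - 1.0) * 100.0;
}

/* Print one result line relative to the epoll baseline. */
static void print_relative(const char *name, double rate, double epoll_rate)
{
    printf("%-42s %+7.2f%% vs epoll\n", name,
           relative_percent(rate, epoll_rate));
}
```

For example, print_relative("samba_io_uring_ev-with-defer_taskrun", 122853.85, 114450.16) prints that backend's difference from the epoll rate measured on the same system.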
Any chance you can do a run with just IORING_SETUP_COOP_TASKRUN set? I'm
curious how big of an impact the IPI elimination is, where it slots in
compared to the defer taskrun and the default settings.
There's no real difference between these:
- no flag
- IORING_SETUP_COOP_TASKRUN|IORING_SETUP_TASKRUN_FLAG
- IORING_SETUP_SINGLE_ISSUER
- IORING_SETUP_COOP_TASKRUN|IORING_SETUP_TASKRUN_FLAG|IORING_SETUP_SINGLE_ISSUER
Only these make it fast:
- IORING_SETUP_SINGLE_ISSUER|IORING_SETUP_DEFER_TASKRUN
- IORING_SETUP_COOP_TASKRUN|IORING_SETUP_TASKRUN_FLAG|IORING_SETUP_SINGLE_ISSUER|IORING_SETUP_DEFER_TASKRUN
metze