On 9/16/22 22:36, Stefan Metzmacher wrote:
Hi Pavel, hi Jens,
I did some initial testing with IORING_OP_SEND_ZC.
While reading the code I think I found a race that
can lead to IORING_CQE_F_MORE being missing even if
the net layer got references.
Hey Stefan,
Did you see some kind of buggy behaviour in userspace?
If the network layer sends anything, it should return how many
bytes it queued for sending; otherwise there would be duplicated
packets / data on the other endpoint in userspace, and I
don't think any driver / lower layer would keep memory
after returning an error.
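
To illustrate why a missing IORING_CQE_F_MORE would matter to userspace, here is a minimal sketch of how an application might track one zero-copy send across its CQEs. The flag values match the io_uring UAPI header, but struct zc_send_state and zc_send_on_cqe() are hypothetical names made up for this example; they are not part of liburing or the kernel, and the sketch assumes the completion CQE is always posted before the notification CQE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in the io_uring UAPI header (linux/io_uring.h). */
#define CQE_F_MORE  (1U << 1)   /* IORING_CQE_F_MORE: another CQE will follow */
#define CQE_F_NOTIF (1U << 3)   /* IORING_CQE_F_NOTIF: this is the notification CQE */

/* Hypothetical per-request bookkeeping kept by the application. */
struct zc_send_state {
    size_t len;            /* bytes submitted with the request */
    size_t sent;           /* bytes the kernel reported as queued */
    bool   done;           /* the completion CQE has been seen */
    bool   notif_pending;  /* the net layer may still reference the buffer */
    bool   failed;         /* the completion CQE carried a negative res */
};

/*
 * Feed one CQE (res, flags) into the state.
 * Returns true once the send buffer may be reused or freed.
 */
static bool zc_send_on_cqe(struct zc_send_state *st, int32_t res, uint32_t flags)
{
    assert(st != NULL);

    if (flags & CQE_F_NOTIF) {
        /* Notification CQE: the kernel dropped its buffer references. */
        st->notif_pending = false;
    } else {
        /* Completion CQE: res is the byte count queued, or -errno. */
        st->done = true;
        if (res < 0)
            st->failed = true;
        else
            st->sent += (size_t)res;
        /* Without F_MORE, no notification will ever arrive. */
        st->notif_pending = (flags & CQE_F_MORE) != 0;
    }
    return st->done && !st->notif_pending;
}
```

With this bookkeeping, a completion CQE that lacks IORING_CQE_F_MORE makes the buffer reusable immediately, so if the net layer did take references in that case, the application would reuse memory that is still in flight. A short send shows up as st.sent < st.len and the remainder has to be resubmitted by the caller.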
In any case, I was looking at a slightly different problem, but
it should look much cleaner using the same approach; see
branch [1], and patch [3] for sendzc in particular.
[1] https://github.com/isilence/linux.git partial-fail
[2] https://github.com/isilence/linux/tree/io_uring/partial-fail
[3] https://github.com/isilence/linux/commit/acb4f9bf869e1c2542849e11d992a63d95f2b894
While there I added some code to allow userspace to
know how effective the IORING_OP_SEND_ZC attempt was,
in order to avoid it if it's not used (e.g. on a long-living TCP
connection).
This change requires a change to the existing test, see:
https://github.com/metze-samba/liburing/tree/test-send-zerocopy
Stefan Metzmacher (5):
  io_uring/opdef: rename SENDZC_NOTIF to SEND_ZC
  io_uring/core: move io_cqe->fd over from io_cqe->flags to io_cqe->res
  io_uring/core: keep req->cqe.flags on generic errors
  io_uring/net: let io_sendzc set IORING_CQE_F_MORE before
    sock_sendmsg()
  io_uring/notif: let userspace know how effective the zero copy usage
    was
 include/linux/io_uring_types.h |  6 +++---
 io_uring/io_uring.c            | 18 +++++++++++++-----
 io_uring/net.c                 | 19 +++++++++++++------
 io_uring/notif.c               | 18 ++++++++++++++++++
 io_uring/opdef.c               |  2 +-
 net/ipv4/ip_output.c           |  3 ++-
 net/ipv4/tcp.c                 |  2 ++
 net/ipv6/ip6_output.c          |  3 ++-
 8 files changed, 54 insertions(+), 17 deletions(-)
--
Pavel Begunkov