On 21.10.22 at 11:27, Pavel Begunkov wrote:
On 10/21/22 09:32, Stefan Metzmacher wrote:
Hi Pavel,
Experimenting with this stuff makes me wish for a way to
set a different 'user_data' field for the notif cqe,
maybe based on an IORING_RECVSEND_ flag. It would make my life
easier and would avoid some complexity in userspace,
as I need to handle retries on short writes even with MSG_WAITALL,
since EINTR and other errors can cause them.
What do you think?
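To make the userspace complexity concrete, here is a minimal sketch of the completion handling this implies today, where the send cqe and the notif cqe share one user_data and can only be told apart by the cqe flags. It is plain C with no liburing dependency: struct mock_cqe, struct send_state and handle_send_cqe() are hypothetical names made up for illustration, and MOCK_CQE_F_NOTIF is assumed to mirror IORING_CQE_F_NOTIF from the uapi header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror IORING_CQE_F_NOTIF in include/uapi/linux/io_uring.h. */
#define MOCK_CQE_F_NOTIF (1U << 3)

/* Hypothetical stand-in for struct io_uring_cqe. */
struct mock_cqe {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
};

/* Hypothetical per-send bookkeeping kept by the application. */
struct send_state {
	size_t len;      /* total bytes the caller wants to send */
	size_t done;     /* bytes the kernel has accepted so far */
	int buf_in_use;  /* stays 1 until the notif cqe arrives */
	int need_retry;  /* 1 if another SEND_ZC must be queued */
	int failed;      /* 1 on a non-retryable error */
};

/* Both cqes arrive with the same user_data, so the flag decides the path. */
static void handle_send_cqe(struct send_state *st, const struct mock_cqe *cqe)
{
	assert(st && cqe);

	if (cqe->flags & MOCK_CQE_F_NOTIF) {
		/* Notification: the kernel no longer references the buffer. */
		st->buf_in_use = 0;
		return;
	}

	st->need_retry = 0;
	if (cqe->res == -EINTR) {
		/* Interrupted before sending anything: queue the send again. */
		st->need_retry = 1;
		return;
	}
	if (cqe->res < 0) {
		st->failed = 1;
		return;
	}

	/* Short write: account for the progress and retry the remainder. */
	st->done += (size_t)cqe->res;
	st->need_retry = st->done < st->len;
}
```

With a separate user_data for the notif, the retried send and the still outstanding notification could presumably be tracked under different keys instead of being multiplexed through one state object like this.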
Any comment on this?
IORING_SEND_NOTIF_USER_DATA could let us use
notif->cqe.user_data = sqe->addr3;
I'd rather not use the last available u64, tbh, that was the
reason for not adding a second user_data in the first place.
As far as I can see io_send_zc_prep has this:
if (unlikely(READ_ONCE(sqe->__pad2[0]) || READ_ONCE(sqe->addr3)))
return -EINVAL;
both are u64...
Hah, true, completely forgot about that one
So would a commit like below be fine for you?
Do you have anything in mind for SEND[MSG]_ZC that could possibly use
another u64 in future?
metze
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 738d6234d1d9..7a6272872334 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -300,6 +300,7 @@ enum io_uring_op {
#define IORING_RECVSEND_POLL_FIRST (1U << 0)
#define IORING_RECV_MULTISHOT (1U << 1)
#define IORING_RECVSEND_FIXED_BUF (1U << 2)
+#define IORING_SEND_NOTIF_USER_DATA (1U << 3)
/*
* accept flags stored in sqe->ioprio
diff --git a/io_uring/net.c b/io_uring/net.c
index 735eec545115..e1bc06b58cd7 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -938,7 +938,7 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
struct io_ring_ctx *ctx = req->ctx;
struct io_kiocb *notif;
- if (unlikely(READ_ONCE(sqe->__pad2[0]) || READ_ONCE(sqe->addr3)))
+ if (unlikely(READ_ONCE(sqe->__pad2[0])))
return -EINVAL;
/* we don't support IOSQE_CQE_SKIP_SUCCESS just yet */
if (req->flags & REQ_F_CQE_SKIP)
@@ -946,12 +946,19 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
zc->flags = READ_ONCE(sqe->ioprio);
if (zc->flags & ~(IORING_RECVSEND_POLL_FIRST |
- IORING_RECVSEND_FIXED_BUF))
+ IORING_RECVSEND_FIXED_BUF |
+ IORING_SEND_NOTIF_USER_DATA))
return -EINVAL;
notif = zc->notif = io_alloc_notif(ctx);
if (!notif)
return -ENOMEM;
- notif->cqe.user_data = req->cqe.user_data;
+ if (zc->flags & IORING_SEND_NOTIF_USER_DATA) {
+ notif->cqe.user_data = READ_ONCE(sqe->addr3);
+ } else {
+ if (unlikely(READ_ONCE(sqe->addr3)))
+ return -EINVAL;
+ notif->cqe.user_data = req->cqe.user_data;
+ }
notif->cqe.res = 0;
notif->cqe.flags = IORING_CQE_F_NOTIF;
req->flags |= REQ_F_NEED_CLEANUP;
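
The intended semantics of the patch above can be modelled in plain userspace C, which makes the three cases easy to check: flag set, flag clear with addr3 zero, and flag clear with addr3 non-zero. This is only a sketch under stated assumptions, not kernel code: struct mock_sqe, mock_send_zc_prep() and the MOCK_ constants are hypothetical names, and MOCK_SEND_NOTIF_USER_DATA stands for the flag proposed in this mail, which is not part of any released uapi header.

```c
#include <errno.h>
#include <stdint.h>

/* Mirror the flag bits used in the diff above. */
#define MOCK_RECVSEND_POLL_FIRST   (1U << 0)
#define MOCK_RECVSEND_FIXED_BUF    (1U << 2)
#define MOCK_SEND_NOTIF_USER_DATA  (1U << 3)  /* proposed, not upstream */

/* Hypothetical stand-in holding only the sqe fields the check looks at. */
struct mock_sqe {
	uint16_t ioprio;     /* carries the IORING_RECVSEND_* flags */
	uint64_t user_data;  /* user_data of the send cqe */
	uint64_t addr3;      /* would carry the notif user_data */
	uint64_t pad2;       /* must stay zero */
};

/*
 * Returns 0 and stores the user_data the notif cqe would carry,
 * or -EINVAL if the sqe would be rejected.
 */
static int mock_send_zc_prep(const struct mock_sqe *sqe,
			     uint64_t *notif_user_data)
{
	unsigned int flags = sqe->ioprio;

	if (sqe->pad2)
		return -EINVAL;
	if (flags & ~(MOCK_RECVSEND_POLL_FIRST |
		      MOCK_RECVSEND_FIXED_BUF |
		      MOCK_SEND_NOTIF_USER_DATA))
		return -EINVAL;

	if (flags & MOCK_SEND_NOTIF_USER_DATA) {
		/* Opt-in: the notif gets its own user_data from addr3. */
		*notif_user_data = sqe->addr3;
		return 0;
	}

	/* Without the flag, addr3 stays reserved and must be zero. */
	if (sqe->addr3)
		return -EINVAL;
	*notif_user_data = sqe->user_data;
	return 0;
}
```

Keeping the addr3 check in the branch without the flag means existing applications see no change in behaviour, and addr3 remains available for other uses whenever the flag is not set.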