[PATCH 4/5] io_uring/net: let io_sendzc set IORING_CQE_F_MORE before sock_sendmsg()

sock_sendmsg() can take references to the passed buffers even on
failure!

So we need to make sure we set IORING_CQE_F_MORE before
calling sock_sendmsg().

As REQ_F_CQE_SKIP for the notif and IORING_CQE_F_MORE for the main
request go hand in hand, let's simplify the REQ_F_CQE_SKIP logic too.

We just start with REQ_F_CQE_SKIP set and clear it when we set
IORING_CQE_F_MORE on the main request, so that the transition
happens in one isolated place.

In the future we might be able to revert to clearing IORING_CQE_F_MORE
and setting REQ_F_CQE_SKIP again if we find out that no reference was
taken by the network layer. But that's a change for another day.
The important thing is that the documentation for IORING_OP_SEND_ZC
should indicate that the kernel may decide to return just a single cqe
without IORING_CQE_F_MORE, even in the success case, so that userspace
does not break when we add such an optimization at a later point.
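
For illustration only (not part of this patch), a minimal userspace
sketch of that contract, assuming liburing >= 2.3 for
io_uring_prep_send_zc(); the helper name and error handling are made
up here. The point is that userspace only waits for the
IORING_CQE_F_NOTIF cqe if IORING_CQE_F_MORE was set on the main cqe,
so it keeps working whether or not a later kernel returns a single cqe:

#include <errno.h>
#include <liburing.h>

/* Hypothetical helper: submit one zerocopy send and reap its cqe(s). */
static int send_zc_and_reap(struct io_uring *ring, int sockfd,
			    const void *buf, size_t len)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
	struct io_uring_cqe *cqe;
	unsigned more;
	int ret, send_res;

	if (!sqe)
		return -EBUSY;

	io_uring_prep_send_zc(sqe, sockfd, buf, len, 0, 0);
	io_uring_submit(ring);

	/* Main cqe: carries the send result. */
	ret = io_uring_wait_cqe(ring, &cqe);
	if (ret < 0)
		return ret;
	send_res = cqe->res;
	more = cqe->flags & IORING_CQE_F_MORE;
	io_uring_cqe_seen(ring, cqe);

	if (more) {
		/*
		 * A notification cqe (IORING_CQE_F_NOTIF) will follow
		 * once the kernel no longer references buf; only then
		 * may the buffer be reused.
		 */
		ret = io_uring_wait_cqe(ring, &cqe);
		if (ret < 0)
			return ret;
		io_uring_cqe_seen(ring, cqe);
	}

	return send_res;
}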

Fixes: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API")
Signed-off-by: Stefan Metzmacher <metze@xxxxxxxxx>
Cc: Pavel Begunkov <asml.silence@xxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: io-uring@xxxxxxxxxxxxxxx
---
 io_uring/net.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/io_uring/net.c b/io_uring/net.c
index e9efed40cf3d..61e6194b01b7 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -883,7 +883,6 @@ void io_sendzc_cleanup(struct io_kiocb *req)
 {
 	struct io_sendzc *zc = io_kiocb_to_cmd(req, struct io_sendzc);
 
-	zc->notif->flags |= REQ_F_CQE_SKIP;
 	io_notif_flush(zc->notif);
 	zc->notif = NULL;
 }
@@ -920,6 +919,8 @@ int io_sendzc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	notif->cqe.user_data = req->cqe.user_data;
 	notif->cqe.res = 0;
 	notif->cqe.flags = IORING_CQE_F_NOTIF;
+	/* skip the notif cqe until we call sock_sendmsg() */
+	notif->flags |= REQ_F_CQE_SKIP;
 	req->flags |= REQ_F_NEED_CLEANUP;
 
 	zc->buf = u64_to_user_ptr(READ_ONCE(sqe->addr));
@@ -1000,7 +1001,7 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags)
 	struct msghdr msg;
 	struct iovec iov;
 	struct socket *sock;
-	unsigned msg_flags, cflags;
+	unsigned msg_flags;
 	int ret, min_ret = 0;
 
 	sock = sock_from_file(req->file);
@@ -1055,6 +1056,15 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags)
 	msg.msg_flags = msg_flags;
 	msg.msg_ubuf = &io_notif_to_data(zc->notif)->uarg;
 	msg.sg_from_iter = io_sg_from_iter;
+
+	/*
+	 * Now that we're about to call sock_sendmsg(),
+	 * we must assume the data is referenced,
+	 * even on failure!
+	 * So we need to force a NOTIF cqe.
+	 */
+	zc->notif->flags &= ~REQ_F_CQE_SKIP;
+	req->cqe.flags |= IORING_CQE_F_MORE;
 	ret = sock_sendmsg(sock, &msg);
 
 	if (unlikely(ret < min_ret)) {
@@ -1068,8 +1078,6 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags)
 			req->flags |= REQ_F_PARTIAL_IO;
 			return io_setup_async_addr(req, addr, issue_flags);
 		}
-		if (ret < 0 && !zc->done_io)
-			zc->notif->flags |= REQ_F_CQE_SKIP;
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
 		req_set_fail(req);
@@ -1082,8 +1090,7 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags)
 
 	io_notif_flush(zc->notif);
 	req->flags &= ~REQ_F_NEED_CLEANUP;
-	cflags = ret >= 0 ? IORING_CQE_F_MORE : 0;
-	io_req_set_res(req, ret, cflags);
+	io_req_set_res(req, ret, req->cqe.flags);
 	return IOU_OK;
 }
 
-- 
2.34.1



