Re: [PATCH 3/6] io_uring: add support for NO_OFFLOAD


 



On 4/19/23 17:25, Jens Axboe wrote:
Some applications don't necessarily care about io_uring not blocking for
request issue, they simply want to use io_uring for batched submission
of IO. However, io_uring will always do non-blocking issues, and for
some request types, there's simply no support for doing non-blocking
issue and hence they get punted to io-wq unconditionally. If the
application doesn't care about issue potentially blocking, this causes
a performance slowdown as thread offload is not nearly as efficient as
inline issue.

Add support for configuring the ring with IORING_SETUP_NO_OFFLOAD, and
add an IORING_ENTER_NO_OFFLOAD flag to io_uring_enter(2). If either one
of these is set, then io_uring will ignore the non-block issue attempt
for any file which we cannot poll for readiness. The simplified io_uring
issue model looks as follows:

1) Non-blocking issue is attempted for IO. If successful, we're done for
    now.

2) Case 1 failed. Now we have two options:
	a) We can poll the file. We arm poll, and we're done for now
	   until that triggers.
	b) File cannot be polled, we punt to io-wq which then does a
	   blocking attempt.

If either of the NO_OFFLOAD flags is set, we should never hit case
2b. Instead, case 1 would issue the IO without the non-blocking flag
being set and perform an inline completion.
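
The issue model described above can be sketched as a small userspace
decision function. This is only an illustration of the commit message, not
kernel code: struct fake_req, enum outcome and issue() are hypothetical
names, and the three booleans stand in for file_can_poll(), the result of a
non-blocking attempt, and REQ_F_NO_OFFLOAD.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of one issue attempt (illustration only). */
enum outcome {
	DONE_INLINE_NONBLOCK,	/* case 1: non-blocking attempt succeeded */
	ARMED_POLL,		/* case 2a: poll armed, retry when it triggers */
	PUNTED_IOWQ,		/* case 2b: offloaded to io-wq */
	DONE_INLINE_BLOCKING,	/* NO_OFFLOAD: issued inline, may block */
};

/* Hypothetical stand-in for the request state the patch looks at. */
struct fake_req {
	bool can_poll;		/* models file_can_poll(); false if no file */
	bool nonblock_ok;	/* would a non-blocking attempt succeed now? */
	bool no_offload;	/* models REQ_F_NO_OFFLOAD on the request */
};

static enum outcome issue(const struct fake_req *req)
{
	bool nonblock = true;

	assert(req);

	/* Models the io_issue_sqe() hunk: clear the non-blocking flag
	 * when NO_OFFLOAD is set and the file cannot be polled. */
	if (req->no_offload && !req->can_poll)
		nonblock = false;

	if (!nonblock)
		return DONE_INLINE_BLOCKING;
	if (req->nonblock_ok)
		return DONE_INLINE_NONBLOCK;
	if (req->can_poll)
		return ARMED_POLL;
	return PUNTED_IOWQ;
}
```

In this sketch PUNTED_IOWQ is only reachable when no_offload is false,
which matches the statement that case 2b should never be hit with either
flag set. Pollable files behave the same with or without the flag.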

Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
  include/linux/io_uring_types.h |  3 +++
  include/uapi/linux/io_uring.h  |  7 +++++++
  io_uring/io_uring.c            | 26 ++++++++++++++++++++------
  io_uring/io_uring.h            |  2 +-
  io_uring/sqpoll.c              |  3 ++-
  5 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 4dd54d2173e1..c54f3fb7ab1a 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -403,6 +403,7 @@ enum {
  	REQ_F_APOLL_MULTISHOT_BIT,
  	REQ_F_CLEAR_POLLIN_BIT,
  	REQ_F_HASH_LOCKED_BIT,
+	REQ_F_NO_OFFLOAD_BIT,
  	/* keep async read/write and isreg together and in order */
  	REQ_F_SUPPORT_NOWAIT_BIT,
  	REQ_F_ISREG_BIT,
@@ -475,6 +476,8 @@ enum {
  	REQ_F_CLEAR_POLLIN	= BIT_ULL(REQ_F_CLEAR_POLLIN_BIT),
  	/* hashed into ->cancel_hash_locked, protected by ->uring_lock */
  	REQ_F_HASH_LOCKED	= BIT_ULL(REQ_F_HASH_LOCKED_BIT),
+	/* don't offload to io-wq, issue blocking if needed */
+	REQ_F_NO_OFFLOAD	= BIT_ULL(REQ_F_NO_OFFLOAD_BIT),
  };

  typedef void (*io_req_tw_func_t)(struct io_kiocb *req, struct io_tw_state *ts);
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 0716cb17e436..ea903a677ce9 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -173,6 +173,12 @@ enum {
   */
  #define IORING_SETUP_DEFER_TASKRUN	(1U << 13)
+/*
+ * Don't attempt non-blocking issue on file types that would otherwise
+ * punt to io-wq if they cannot be completed non-blocking.
+ */
+#define IORING_SETUP_NO_OFFLOAD		(1U << 14)
+
  enum io_uring_op {
  	IORING_OP_NOP,
  	IORING_OP_READV,
@@ -443,6 +449,7 @@ struct io_cqring_offsets {
  #define IORING_ENTER_SQ_WAIT		(1U << 2)
  #define IORING_ENTER_EXT_ARG		(1U << 3)
  #define IORING_ENTER_REGISTERED_RING	(1U << 4)
+#define IORING_ENTER_NO_OFFLOAD		(1U << 5)

  /*
   * Passed in for io_uring_setup(2). Copied back with updated info on success
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9568b5e4cf87..04770b06de16 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1947,6 +1947,10 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
  	if (unlikely(!io_assign_file(req, def, issue_flags)))
  		return -EBADF;
+	if (req->flags & REQ_F_NO_OFFLOAD &&
+	    (!req->file || !file_can_poll(req->file)))
+		issue_flags &= ~IO_URING_F_NONBLOCK;
+
  	if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred()))
  		creds = override_creds(req->creds);
@@ -2337,7 +2341,7 @@ static __cold int io_submit_fail_init(const struct io_uring_sqe *sqe,
  }

  static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
-			 const struct io_uring_sqe *sqe)
+			 const struct io_uring_sqe *sqe, bool no_offload)
  	__must_hold(&ctx->uring_lock)
  {
  	struct io_submit_link *link = &ctx->submit_state.link;
@@ -2385,6 +2389,9 @@ static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
  		return 0;
  	}
+	if (no_offload)
+		req->flags |= REQ_F_NO_OFFLOAD;

Shouldn't this apply only to the initial "in syscall" submission,
and not be extended to tw (task_work)? I'd say it should, otherwise
it risks making !DEFER_TASKRUN totally unpredictable. E.g. any
syscall can try to execute tw and get stuck waiting in there.
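
One possible reading of that suggestion, as a rough userspace sketch and
not a proposed patch: allow the blocking inline issue only when the request
is issued from the submitting syscall, and keep the normal non-blocking
path for anything re-issued from task_work. The names enum issue_ctx and
may_block_inline() are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: where the issue attempt is running from. */
enum issue_ctx {
	CTX_SYSCALL_SUBMIT,	/* initial submission inside io_uring_enter(2) */
	CTX_TASK_WORK,		/* retry or continuation run from task_work */
};

/*
 * Hypothetical helper: returns true if the request may be issued
 * blocking inline. Only the initial in-syscall submission qualifies,
 * so a task_work run never ends up waiting on a blocking issue.
 */
static bool may_block_inline(bool no_offload, bool can_poll,
			     enum issue_ctx ctx)
{
	if (ctx != CTX_SYSCALL_SUBMIT)
		return false;
	return no_offload && !can_poll;
}
```

With this shape, a request that reaches task_work falls back to the
existing poll or io-wq handling instead of blocking whichever syscall
happened to run the task_work.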

--
Pavel Begunkov


