Re: [PATCH v6 RESEND] io_uring: releasing CPU resources when polling

On 7/9/24 10:29, hexue wrote:
io_uring's polling mode can improve I/O performance, but it spends
100% of CPU resources on polling.

This adds a setup flag, "IORING_SETUP_HY_POLL", that gives applications
an interface to enable a new hybrid polling mode at the io_uring level.

The new hybrid poll is implemented in the io_uring layer. Once an I/O is
issued, it does not poll immediately; it blocks first, wakes up before the
I/O completes, and then polls to reap it. This may be a suboptimal solution
when running on a single thread: it offers performance lower than regular
polling but higher than IRQ, and CPU utilization is also lower than with
regular polling.
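
The sleep-then-poll trade-off described above can be illustrated with a small userspace simulation. This is only a sketch on a virtual clock, not the patch's code: `simulate_wait`, `struct wait_result`, and the fixed per-poll cost are hypothetical names and assumptions, and the excerpt does not show how `io_delay()` actually picks its sleep time.

```c
#include <assert.h>
#include <stdint.h>

/* Outcome of waiting for one simulated I/O. */
struct wait_result {
	unsigned polls;       /* poll attempts made (a proxy for CPU spent) */
	uint64_t observed_ns; /* virtual time at which completion was seen */
};

/*
 * Simulate waiting for an I/O that completes at complete_ns.
 * sleep_ns == 0 models regular polling; sleep_ns > 0 models blocking
 * first and polling afterwards. poll_interval_ns is the assumed cost
 * of one poll attempt.
 */
static struct wait_result simulate_wait(uint64_t complete_ns, uint64_t sleep_ns,
					uint64_t poll_interval_ns)
{
	struct wait_result r;

	assert(poll_interval_ns > 0);
	r.observed_ns = sleep_ns; /* the first poll runs right after the sleep */
	r.polls = 1;
	while (r.observed_ns < complete_ns) {
		r.observed_ns += poll_interval_ns;
		r.polls++;
	}
	return r;
}
```

With a completion at 100 ns and 10 ns per poll, no sleep costs 11 polls, a 50 ns sleep costs 6, and a 150 ns sleep costs a single poll but sees the completion 50 ns late. Over-sleeping of that kind is one plausible reason a hybrid scheme can lose IOPS relative to regular polling.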

Test Result
fio-3.35, Gen 4 device
-------------------------------------------------------------------------------------
Performance
-------------------------------------------------------------------------------------
                   write          read           randwrite       randread
regular poll    BW=3939MiB/s    BW=6596MiB/s    IOPS=190K       IOPS=526K
IRQ             BW=3927MiB/s    BW=6567MiB/s    IOPS=181K       IOPS=216K
hybrid poll     BW=3933MiB/s    BW=6600MiB/s    IOPS=190K       IOPS=390K(suboptimal)
-------------------------------------------------------------------------------------
CPU Utilization
-------------------------------------------------------------------------------------
                 write   read    randwrite       randread
regular poll    100%    100%    100%            100%
IRQ             38%     53%     100%            100%
hybrid poll     76%     32%     70%              85%
-------------------------------------------------------------------------------------

--
...
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1a2128459cb4..5505f4292ce5 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
+static int io_uring_hybrid_poll(struct io_kiocb *req,
+				struct io_comp_batch *iob, unsigned int poll_flags)
+{
+	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
+	struct io_ring_ctx *ctx = req->ctx;
+	int ret;
+	u64 runtime, sleep_time;
+
+	sleep_time = io_delay(ctx, req);
+
+	/* it doesn't implement with io_uring passthrough now */
+	ret = req->file->f_op->iopoll(&rw->kiocb, iob, poll_flags);

->iopoll vs ->uring_cmd_iopoll, same comment as in my
previous review
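
One reading of this comment is that the hybrid path should pick the poll hook by opcode, the way the existing branch in io_do_iopoll() does for IORING_OP_URING_CMD. A minimal userspace sketch of that dispatch follows; every `mock_` name and both stub handlers are hypothetical stand-ins, not kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Mock opcodes; only the read/write vs. passthrough split matters here. */
enum mock_opcode { MOCK_OP_READ, MOCK_OP_URING_CMD };

struct mock_req;

/* Mock of the two poll hooks named in the review comment. */
struct mock_fops {
	int (*iopoll)(struct mock_req *req);
	int (*uring_cmd_iopoll)(struct mock_req *req);
};

struct mock_req {
	enum mock_opcode opcode;
	const struct mock_fops *f_op;
};

/* Stub handlers that only report which hook was reached. */
static int rw_poll_stub(struct mock_req *req)  { (void)req; return 1; }
static int cmd_poll_stub(struct mock_req *req) { (void)req; return 2; }

/* Choose the hook by opcode; return -1 if the file lacks that hook. */
static int mock_poll_dispatch(struct mock_req *req)
{
	assert(req != NULL && req->f_op != NULL);
	if (req->opcode == MOCK_OP_URING_CMD)
		return req->f_op->uring_cmd_iopoll ?
		       req->f_op->uring_cmd_iopoll(req) : -1;
	return req->f_op->iopoll ? req->f_op->iopoll(req) : -1;
}
```

Calling ->iopoll unconditionally, as the hunk above does, would skip the second branch for passthrough requests.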


+
+	req->iopoll_end = ktime_get_ns();
+	runtime = req->iopoll_end - req->iopoll_start - sleep_time;
+	if (runtime < 0)
+		return 0;
+
+	/* use minimize sleep time if there are different speed
+	 * drivers, it could get more completions from fast one
+	 */
+	if (ctx->available_time > runtime)
+		ctx->available_time = runtime;
+	return ret;
+}
+
  int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
  {
  	struct io_wq_work_node *pos, *start, *prev;
@@ -1133,7 +1203,9 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
  		if (READ_ONCE(req->iopoll_completed))
  			break;
-		if (req->opcode == IORING_OP_URING_CMD) {
+		if (ctx->flags & IORING_SETUP_HY_POLL) {
+			ret = io_uring_hybrid_poll(req, &iob, poll_flags);
+		} else if (req->opcode == IORING_OP_URING_CMD) {
  			struct io_uring_cmd *ioucmd;
  			ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);

--
Pavel Begunkov



