Re: [PATCH 1/2] engines:io_uring: slat and clat calculation with sqthread_poll

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/4/22 07:13, Ankit Kumar wrote:
When sqthread_poll is specified for io_uring and io_uring_cmd I/O engines,
fio doesn't report submission latency and the completion latency is too big.
Latency data before:

fio --name=test --size=1M --rw=randread --ioengine=io_uring --sqthread_poll=1
     clat (msec): min=1120.1k, max=1120.1k, avg=1120092.65, stdev= 8.32
      lat (usec): min=104, max=5312, avg=132.81, stdev=325.05
     clat percentiles (msec):
      |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
      | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
      | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
      | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
      | 99.99th=[17113]
   lat (msec)   : >=2000=100.00%

As kernel polling thread handles the submission, there is no way to know when
the actual submission happened. We can only rely on the commit hook and
measure the issue time.
Latency data after the change:

fio --name=test --size=1M --rw=randread --ioengine=io_uring --sqthread_poll=1
     slat (nsec): min=50, max=2230, avg=146.68, stdev=138.08
     clat (usec): min=105, max=5151, avg=132.98, stdev=314.89
      lat (usec): min=105, max=5153, avg=133.13, stdev=315.03
     clat percentiles (usec):
      |  1.00th=[  106],  5.00th=[  108], 10.00th=[  109], 20.00th=[  110],
      | 30.00th=[  111], 40.00th=[  113], 50.00th=[  114], 60.00th=[  115],
      | 70.00th=[  117], 80.00th=[  118], 90.00th=[  119], 95.00th=[  121],
      | 99.00th=[  123], 99.50th=[  123], 99.90th=[ 5145], 99.95th=[ 5145],
      | 99.99th=[ 5145]
   lat (usec)   : 250=99.61%
   lat (msec)   : 10=0.39%

Signed-off-by: Ankit Kumar <ankit.kumar@xxxxxxxxxxx>
---
  engines/io_uring.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/engines/io_uring.c b/engines/io_uring.c
index 6906e0a4..0d18fd4a 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -637,12 +637,16 @@ static int fio_ioring_commit(struct thread_data *td)
  	 */
  	if (o->sqpoll_thread) {
  		struct io_sq_ring *ring = &ld->sq_ring;
+		unsigned start = *ld->sq_ring.head;
  		unsigned flags;
flags = atomic_load_acquire(ring->flags);
  		if (flags & IORING_SQ_NEED_WAKEUP)
  			io_uring_enter(ld, ld->queued, 0,
  					IORING_ENTER_SQ_WAKEUP);
+		fio_ioring_queued(td, start, ld->queued);
+		io_u_mark_submit(td, ld->queued);
+
  		ld->queued = 0;
  		return 0;
  	}

Ankit, I think the important point here is to make sure that the reported slat and clat values when sqthread_poll=1 can be meaningfully compared to corresponding values when sqthread_poll=0.

That means we need to record issue_time at corresponding points in the submission process but it's not obvious to me where to record issue_time when sqthread_poll=1.

I can think of two reasonable solutions when sqthread_poll is enabled:

1) suppress slat and clat when sqthread_poll=1 because we don't have a good place to record issue_time

2) record issue_time and the end of fio_ioring_queue() when IORING_SQ_NEED_WAKEUP is not set and in commit() as you have above when it is flagged

What do you think?

Jens, do you have an opinion here?

Vincent



[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux