When sqthread_poll is specified for io_uring and io_uring_cmd I/O engines, fio doesn't report submission latency and the completion latency is too big. Latency data before: fio --name=test --size=1M --rw=randread --ioengine=io_uring --sqthread_poll=1 clat (msec): min=1120.1k, max=1120.1k, avg=1120092.65, stdev= 8.32 lat (usec): min=104, max=5312, avg=132.81, stdev=325.05 clat percentiles (msec): | 1.00th=[17113], 5.00th=[17113], 10.00th=[17113], 20.00th=[17113], | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113], | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113], | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113], | 99.99th=[17113] lat (msec) : >=2000=100.00% As kernel polling thread handles the submission, there is no way to know when the actual submission happened. We can only rely on the commit hook and measure the issue time. Latency data after the change: fio --name=test --size=1M --rw=randread --ioengine=io_uring --sqthread_poll=1 slat (nsec): min=50, max=2230, avg=146.68, stdev=138.08 clat (usec): min=105, max=5151, avg=132.98, stdev=314.89 lat (usec): min=105, max=5153, avg=133.13, stdev=315.03 clat percentiles (usec): | 1.00th=[ 106], 5.00th=[ 108], 10.00th=[ 109], 20.00th=[ 110], | 30.00th=[ 111], 40.00th=[ 113], 50.00th=[ 114], 60.00th=[ 115], | 70.00th=[ 117], 80.00th=[ 118], 90.00th=[ 119], 95.00th=[ 121], | 99.00th=[ 123], 99.50th=[ 123], 99.90th=[ 5145], 99.95th=[ 5145], | 99.99th=[ 5145] lat (usec) : 250=99.61% lat (msec) : 10=0.39% Signed-off-by: Ankit Kumar <ankit.kumar@xxxxxxxxxxx> --- engines/io_uring.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/engines/io_uring.c b/engines/io_uring.c index 6906e0a4..0d18fd4a 100644 --- a/engines/io_uring.c +++ b/engines/io_uring.c @@ -637,12 +637,16 @@ static int fio_ioring_commit(struct thread_data *td) */ if (o->sqpoll_thread) { struct io_sq_ring *ring = &ld->sq_ring; + unsigned start = *ld->sq_ring.head; unsigned flags; flags = atomic_load_acquire(ring->flags); if (flags & IORING_SQ_NEED_WAKEUP) io_uring_enter(ld, ld->queued, 0, IORING_ENTER_SQ_WAKEUP); + fio_ioring_queued(td, start, ld->queued); + io_u_mark_submit(td, ld->queued); + ld->queued = 0; return 0; } -- 2.17.1