When sqthread_poll is specified for the io_uring and io_uring_cmd I/O
engines, fio doesn't report submission latency, and the reported
completion latency is far too large.

Latency data before the change:

fio --name=test --size=1M --rw=randread --ioengine=io_uring --sqthread_poll=1

    clat (msec): min=1120.1k, max=1120.1k, avg=1120092.65, stdev= 8.32
     lat (usec): min=104, max=5312, avg=132.81, stdev=325.05
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
  lat (msec)   : >=2000=100.00%

Because the kernel polling thread handles the submission, there is no
way to know when the actual submission happened. We can only rely on
the commit hook and measure the issue time.

Latency data after the change:

fio --name=test --size=1M --rw=randread --ioengine=io_uring --sqthread_poll=1

    slat (nsec): min=50, max=2230, avg=146.68, stdev=138.08
    clat (usec): min=105, max=5151, avg=132.98, stdev=314.89
     lat (usec): min=105, max=5153, avg=133.13, stdev=315.03
    clat percentiles (usec):
     |  1.00th=[  106],  5.00th=[  108], 10.00th=[  109], 20.00th=[  110],
     | 30.00th=[  111], 40.00th=[  113], 50.00th=[  114], 60.00th=[  115],
     | 70.00th=[  117], 80.00th=[  118], 90.00th=[  119], 95.00th=[  121],
     | 99.00th=[  123], 99.50th=[  123], 99.90th=[ 5145], 99.95th=[ 5145],
     | 99.99th=[ 5145]
  lat (usec)   : 250=99.61%
  lat (msec)   : 10=0.39%

This fixes the issue:
https://github.com/axboe/fio/issues/1484

Ankit Kumar (2):
  engines:io_uring: slat and clat calculation with sqthread_poll
  doc: update about sqthread_poll

 HOWTO.rst          | 4 +++-
 engines/io_uring.c | 4 ++++
 fio.1              | 4 +++-
 3 files changed, 10 insertions(+), 2 deletions(-)

-- 
2.17.1