> -----Original Message-----
> From: Alex Matyushin [mailto:airgunster@xxxxxxxxxxxxx]
> Sent: Tuesday, July 12, 2022 2:59 PM
> To: fio@xxxxxxxxxxxxxxx
> Subject: fdatasync latency percentiles
>
> Hi there,
>
> I'm trying to understand the difference between the latency reported by
> the fdatasync latency percentiles (_sync percentiles_) of
> `--ioengine=psync --fdatasync=1` and the _clat percentiles_ of
> `--ioengine=libaio --direct=1`, and I'd really appreciate your help.
>
> I thought they should not differ so much from each other. I assumed
> (am I wrong here?) that the fdatasync latency percentiles
> (_sync percentiles_) report the completion time of the `fdatasync()`
> syscall, so that I could compare them with the _clat percentiles_ of
> `--ioengine=libaio --direct=1`.
>
> Can someone please explain exactly how the fdatasync latency
> percentiles (_sync percentiles_) of `--ioengine=psync --fdatasync=1`
> are calculated?
>
> My goal is to reproduce an application pattern where each `pwrite64()`
> call is followed by an `fdatasync()`, and I want to measure the
> `fdatasync()` completion latency.
>
> Test A:
> $ sudo fio --rw=write --ioengine=psync --fdatasync=1 --iodepth=1 --bs=512 --numjobs=1 --rate_iops=1 --directory=test-data --size=100mb --runtime=1m --name=azure-westeurope-disk-test
> azure-westeurope-disk-test: (g=0): rw=write, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=psync, iodepth=1
> fio-3.7
> Starting 1 process
> azure-westeurope-disk-test: Laying out IO file (1 file / 100MiB)
> Jobs: 1 (f=1), 0-1 IOPS: [W(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=1 IOPS][eta 00m:00s]
> azure-westeurope-disk-test: (groupid=0, jobs=1): err= 0: pid=12184: Tue Jul 12 18:35:15 2022
>   write: IOPS=1, BW=520B/s (520B/s)(30.5KiB/60001msec)
>     clat (nsec): min=9400, max=44200, avg=29336.07, stdev=6777.19
>      lat (nsec): min=10300, max=48300, avg=31836.07, stdev=7118.83
>     clat percentiles (nsec):
>      |  1.00th=[ 9408],  5.00th=[15808], 10.00th=[25728], 20.00th=[26496],
>      | 30.00th=[27008], 40.00th=[27776], 50.00th=[28288], 60.00th=[29312],
>      | 70.00th=[32640], 80.00th=[33536], 90.00th=[35584], 95.00th=[42240],
>      | 99.00th=[44288], 99.50th=[44288], 99.90th=[44288], 99.95th=[44288],
>      | 99.99th=[44288]
>    bw (  KiB/s): min=0, max=2, per=100.00%, avg=1.00, stdev=0.19, samples=59
>    iops        : min=1, max=4, avg=2.02, stdev=0.29, samples=59
>   lat (usec)   : 10=1.64%, 20=6.56%, 50=91.80%
>   fsync/fdatasync/sync_file_range:
>     sync (usec): min=1112, max=8494, avg=1999.62, stdev=1242.89
>     sync percentiles (usec):
>      |  1.00th=[ 1106],  5.00th=[ 1172], 10.00th=[ 1205], 20.00th=[ 1287],
>      | 30.00th=[ 1369], 40.00th=[ 1434], 50.00th=[ 1516], 60.00th=[ 1631],
>      | 70.00th=[ 1860], 80.00th=[ 2507], 90.00th=[ 3326], 95.00th=[ 4293],
>      | 99.00th=[ 8455], 99.50th=[ 8455], 99.90th=[ 8455], 99.95th=[ 8455],
>      | 99.99th=[ 8455]
>   cpu          : usr=0.01%, sys=0.01%, ctx=136, majf=0, minf=34
>   IO depths    : 1=198.4%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued rwts: total=0,61,0,0 short=60,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>   WRITE: bw=520B/s (520B/s), 520B/s-520B/s (520B/s-520B/s), io=30.5KiB (31.2kB), run=60001-60001msec
>
> Disk stats (read/write):
>   sdb: ios=0/282, merge=0/22, ticks=0/482, in_queue=482, util=0.56%
>
>
> Test B:
> $ sudo fio --rw=write --ioengine=libaio --direct=1 --iodepth=1 --bs=512 --numjobs=1 --rate_iops=1 --directory=test-data --size=100mb --runtime=1m --name=azure-westeurope-disk-test
> azure-westeurope-disk-test: (g=0): rw=write, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=libaio, iodepth=1
> fio-3.7
> Starting 1 process
> Jobs: 1 (f=1), 0-1 IOPS: [W(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=1 IOPS][eta 00m:00s]
> azure-westeurope-disk-test: (groupid=0, jobs=1): err= 0: pid=12291: Tue Jul 12 18:36:35 2022
>   write: IOPS=1, BW=520B/s (520B/s)(30.5KiB/60002msec)
>     slat (usec): min=1140, max=4786, avg=1654.72, stdev=693.43
>     clat (nsec): min=3300, max=12200, avg=5070.49, stdev=1429.96
>      lat (usec): min=1148, max=4794, avg=1662.44, stdev=693.39
>     clat percentiles (nsec):
>      |  1.00th=[ 3312],  5.00th=[ 4320], 10.00th=[ 4320], 20.00th=[ 4384],
>      | 30.00th=[ 4576], 40.00th=[ 4704], 50.00th=[ 4704], 60.00th=[ 4832],
>      | 70.00th=[ 4832], 80.00th=[ 5088], 90.00th=[ 5472], 95.00th=[ 9152],
>      | 99.00th=[12224], 99.50th=[12224], 99.90th=[12224], 99.95th=[12224],
>      | 99.99th=[12224]
>    bw (  KiB/s): min=0, max=2, per=100.00%, avg=1.00, stdev=0.19, samples=59
>    iops        : min=1, max=4, avg=2.02, stdev=0.29, samples=59
>   lat (usec)   : 4=1.64%, 10=96.72%, 20=1.64%
>   cpu          : usr=0.00%, sys=0.01%, ctx=122, majf=0, minf=32
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued rwts: total=0,61,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>   WRITE: bw=520B/s (520B/s), 520B/s-520B/s (520B/s-520B/s), io=30.5KiB (31.2kB), run=60002-60002msec
>
> Disk stats (read/write):
>   sdb: ios=0/301, merge=0/18, ticks=0/432, in_queue=432, util=0.62%
>
> Thanks a lot in advance!
> Alex

Fio has a known issue where strange latency values are reported when the rate_iops option is in use:

https://github.com/axboe/fio/issues/1394

Until that is resolved, experiments like the ones above that use rate_iops are unlikely to produce meaningful results. Try re-running your experiments without the rate_iops option.
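For example, Test A with rate_iops removed could be written as a job file like the one below. This is a sketch, not a tested configuration: the job name, directory, and sizes are simply carried over from the command line above, and every other option matches Test A.

```ini
; Sketch of Test A without rate_iops (names and sizes taken from the
; command line above; adjust to your environment).
[azure-westeurope-disk-test]
rw=write
ioengine=psync
fdatasync=1
iodepth=1
bs=512
numjobs=1
directory=test-data
size=100mb
runtime=1m
```

Save it as, say, `test-a.fio` and run it with `sudo fio test-a.fio`; the sync percentiles it reports should then be free of the rate_iops artifact.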
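Independently of fio, the application pattern described in the original question (each `pwrite64()` followed by an `fdatasync()`, timing only the `fdatasync()`) can be cross-checked with a few lines of Python on Linux. This is a rough sketch for sanity-checking fio's sync percentiles, under the assumption that a simple nearest-rank percentile over the raw samples is close enough to fio's bucketed percentiles for comparison; the iteration count and block size are arbitrary.

```python
import os
import tempfile
import time

def fdatasync_latencies_us(iters=100, bs=512):
    """Issue pwrite() then fdatasync() iters times; return sorted
    fdatasync() latencies in microseconds (only the sync is timed)."""
    fd, path = tempfile.mkstemp()
    buf = b"\0" * bs
    lats = []
    try:
        for i in range(iters):
            os.pwrite(fd, buf, i * bs)        # the write itself (not timed)
            t0 = time.perf_counter()
            os.fdatasync(fd)                  # the durability point we time
            lats.append((time.perf_counter() - t0) * 1e6)
    finally:
        os.close(fd)
        os.unlink(path)
    return sorted(lats)

def percentile(sorted_vals, p):
    """Nearest-rank percentile over pre-sorted samples."""
    idx = min(len(sorted_vals) - 1, int(p / 100 * len(sorted_vals)))
    return sorted_vals[idx]

if __name__ == "__main__":
    lats = fdatasync_latencies_us()
    for p in (50, 95, 99):
        print(f"p{p}: {percentile(lats, p):.0f} usec")
```

The p50/p95/p99 values printed here should land in the same ballpark as the `sync percentiles` fio reports for `--ioengine=psync --fdatasync=1` on the same filesystem, which makes it a useful second opinion while the rate_iops issue is open.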