RE: fdatasync latency percentiles

> -----Original Message-----
> From: Alex Matyushin [mailto:airgunster@xxxxxxxxxxxxx]
> Sent: Tuesday, July 12, 2022 2:59 PM
> To: fio@xxxxxxxxxxxxxxx
> Subject: fdatasync latency percentiles
> 
> Hi there,
> 
> I'm trying to understand the difference in latency reported by the
> fdatasync latency percentiles (_sync percentiles_) on `--ioengine=psync
> --fdatasync=1` vs. the _clat percentiles_ on `--ioengine=libaio
> --direct=1`, and I'd really appreciate your help.
> 
> I thought they should not differ this much from each other. I've assumed
> (am I wrong here?) that the fdatasync latency percentiles
> (_sync percentiles_) report the completion latency of the `fdatasync()`
> syscall, so that I could compare them with the _clat percentiles_ from
> `--ioengine=libaio --direct=1`.
> 
> Can someone please explain exactly how the fdatasync latency percentiles
> (_sync percentiles_) on `--ioengine=psync --fdatasync=1` are calculated?
> 
> My goal is to reproduce an application pattern where the application
> issues an `fdatasync()` after each of its `pwrite64()` calls, and I want
> to measure the `fdatasync()` completion latency.
> 
> Test A:
> $ sudo fio --rw=write --ioengine=psync --fdatasync=1 --iodepth=1 --bs=512 --numjobs=1 --rate_iops=1 --directory=test-data --size=100mb --runtime=1m --name=azure-westeurope-disk-test
> azure-westeurope-disk-test: (g=0): rw=write, bs=(R) 512B-512B, (W)
> 512B-512B, (T) 512B-512B, ioengine=psync, iodepth=1
> fio-3.7
> Starting 1 process
> azure-westeurope-disk-test: Laying out IO file (1 file / 100MiB)
> Jobs: 1 (f=1), 0-1 IOPS: [W(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=1
> IOPS][eta 00m:00s]
> azure-westeurope-disk-test: (groupid=0, jobs=1): err= 0: pid=12184: Tue
> Jul 12 18:35:15 2022
>   write: IOPS=1, BW=520B/s (520B/s)(30.5KiB/60001msec)
>     clat (nsec): min=9400, max=44200, avg=29336.07, stdev=6777.19
>      lat (nsec): min=10300, max=48300, avg=31836.07, stdev=7118.83
>     clat percentiles (nsec):
>      |  1.00th=[ 9408],  5.00th=[15808], 10.00th=[25728], 20.00th=[26496],
>      | 30.00th=[27008], 40.00th=[27776], 50.00th=[28288], 60.00th=[29312],
>      | 70.00th=[32640], 80.00th=[33536], 90.00th=[35584], 95.00th=[42240],
>      | 99.00th=[44288], 99.50th=[44288], 99.90th=[44288], 99.95th=[44288],
>      | 99.99th=[44288]
>    bw (  KiB/s): min=    0, max=    2, per=100.00%, avg= 1.00, stdev= 0.19,
> samples=59
>    iops        : min=    1, max=    4, avg= 2.02, stdev= 0.29, samples=59
>   lat (usec)   : 10=1.64%, 20=6.56%, 50=91.80%
>   fsync/fdatasync/sync_file_range:
>     sync (usec): min=1112, max=8494, avg=1999.62, stdev=1242.89
>     sync percentiles (usec):
>      |  1.00th=[ 1106],  5.00th=[ 1172], 10.00th=[ 1205], 20.00th=[ 1287],
>      | 30.00th=[ 1369], 40.00th=[ 1434], 50.00th=[ 1516], 60.00th=[ 1631],
>      | 70.00th=[ 1860], 80.00th=[ 2507], 90.00th=[ 3326], 95.00th=[ 4293],
>      | 99.00th=[ 8455], 99.50th=[ 8455], 99.90th=[ 8455], 99.95th=[ 8455],
>      | 99.99th=[ 8455]
>   cpu          : usr=0.01%, sys=0.01%, ctx=136, majf=0, minf=34
>   IO depths    : 1=198.4%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
> >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      issued rwts: total=0,61,0,0 short=60,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
> 
> Run status group 0 (all jobs):
>   WRITE: bw=520B/s (520B/s), 520B/s-520B/s (520B/s-520B/s), io=30.5KiB
> (31.2kB), run=60001-60001msec
> 
> Disk stats (read/write):
>   sdb: ios=0/282, merge=0/22, ticks=0/482, in_queue=482, util=0.56%
> 
> 
> 
> Test B:
> $ sudo fio --rw=write --ioengine=libaio --direct=1 --iodepth=1 --bs=512 --numjobs=1 --rate_iops=1 --directory=test-data --size=100mb --runtime=1m --name=azure-westeurope-disk-test
> azure-westeurope-disk-test: (g=0): rw=write, bs=(R) 512B-512B, (W)
> 512B-512B, (T) 512B-512B, ioengine=libaio, iodepth=1
> fio-3.7
> Starting 1 process
> Jobs: 1 (f=1), 0-1 IOPS: [W(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=1
> IOPS][eta 00m:00s]
> azure-westeurope-disk-test: (groupid=0, jobs=1): err= 0: pid=12291: Tue
> Jul 12 18:36:35 2022
>   write: IOPS=1, BW=520B/s (520B/s)(30.5KiB/60002msec)
>     slat (usec): min=1140, max=4786, avg=1654.72, stdev=693.43
>     clat (nsec): min=3300, max=12200, avg=5070.49, stdev=1429.96
>      lat (usec): min=1148, max=4794, avg=1662.44, stdev=693.39
>     clat percentiles (nsec):
>      |  1.00th=[ 3312],  5.00th=[ 4320], 10.00th=[ 4320], 20.00th=[ 4384],
>      | 30.00th=[ 4576], 40.00th=[ 4704], 50.00th=[ 4704], 60.00th=[ 4832],
>      | 70.00th=[ 4832], 80.00th=[ 5088], 90.00th=[ 5472], 95.00th=[ 9152],
>      | 99.00th=[12224], 99.50th=[12224], 99.90th=[12224], 99.95th=[12224],
>      | 99.99th=[12224]
>    bw (  KiB/s): min=    0, max=    2, per=100.00%, avg= 1.00, stdev= 0.19,
> samples=59
>    iops        : min=    1, max=    4, avg= 2.02, stdev= 0.29, samples=59
>   lat (usec)   : 4=1.64%, 10=96.72%, 20=1.64%
>   cpu          : usr=0.00%, sys=0.01%, ctx=122, majf=0, minf=32
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
> >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      issued rwts: total=0,61,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
> 
> Run status group 0 (all jobs):
>   WRITE: bw=520B/s (520B/s), 520B/s-520B/s (520B/s-520B/s), io=30.5KiB
> (31.2kB), run=60002-60002msec
> 
> Disk stats (read/write):
>   sdb: ios=0/301, merge=0/18, ticks=0/432, in_queue=432, util=0.62%
> 
> 
> Thanks a lot in advance!
> Alex

Fio has an issue where strange latency values are reported when the rate_iops option is in use:

https://github.com/axboe/fio/issues/1394

Until that is resolved, experiments like the ones above that use rate_iops are unlikely to produce meaningful results. Try rerunning your experiments without the rate_iops option, for example as shown below.
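
For instance, Test A with the rate limiting dropped would look like this (same options as before, only --rate_iops removed; adjust the directory and size to suit your setup):

$ sudo fio --rw=write --ioengine=psync --fdatasync=1 --iodepth=1 --bs=512 --numjobs=1 --directory=test-data --size=100mb --runtime=1m --name=azure-westeurope-disk-test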
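
If you want to double-check fio's numbers independently, the pattern you describe (an fdatasync() after every pwrite64(), timing just the fdatasync()) can also be measured with a small C sketch like the one below. This is only an illustration, not anything fio does internally; the file name, block size, and iteration count are arbitrary placeholders.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
        const size_t bs = 512;          /* matches the 512B blocks used above */
        char buf[512];
        memset(buf, 'x', sizeof(buf));

        /* placeholder path; point it at the same filesystem you test with fio */
        int fd = open("test-data/sync-test.bin", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        for (int i = 0; i < 100; i++) {
                if (pwrite(fd, buf, bs, (off_t)i * bs) != (ssize_t)bs) {
                        perror("pwrite");
                        break;
                }

                /* time only the fdatasync() call */
                struct timespec t0, t1;
                clock_gettime(CLOCK_MONOTONIC, &t0);
                if (fdatasync(fd) != 0) {
                        perror("fdatasync");
                        break;
                }
                clock_gettime(CLOCK_MONOTONIC, &t1);

                long us = (t1.tv_sec - t0.tv_sec) * 1000000L +
                          (t1.tv_nsec - t0.tv_nsec) / 1000L;
                printf("fdatasync %d: %ld us\n", i, us);
        }

        close(fd);
        return 0;
}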



