Re: Measuring IOPS (solved, I think)

On Friday, 29 July 2011, Martin Steigerwald wrote:
> On Friday, 29 July 2011, Martin Steigerwald wrote:
> > Hi!
> > 
> > I am currently writing an article about fio for a German print
> > magazine, after having packaged it for Debian and used it in
> > performance analysis & tuning trainings.
> > 
> > After introducing the concepts of fio with some basic job files,
> > I'd like to show how to do meaningful IOPS measurements that also
> > work with SSDs that compress.
> > 
> > For some first tests I came up with:
> > 
> > martin@merkaba:~[…]> cat iops.job
> > [global]
> > size=2G
> > bsrange=2-16k
> > filename=iops1
> > numjobs=1
> > iodepth=1
> > # Random data for SSDs that compress
> > refill_buffers=1
> > 
> > [zufälligschreiben]
> > rw=randwrite
> > stonewall
> > [sequentiellschreiben]
> > rw=write
> > stonewall
> > 
> > [zufälliglesen]
> > rw=randread
> > stonewall
> > [sequentielllesen]
> > rw=read
> > 
> > (small German dictionary:
> > - zufällig => random
> > - lesen => read
> > - schreiben => write ;)
> 
> [...]
> 
> > Do you think the above job file could give realistic results? Any
> > suggestions?
> 
> > I got these results:
> With a simpler read job I have different results that puzzle me:
> 
> martin@merkaba:~/Artikel/LinuxNewMedia/fio/Recherche/fio> cat zweierlei-lesen-2gb-variable-blockgrößen.job
> [global]
> rw=randread
> size=2g
> bsrange=2-16k
> 
> [zufälliglesen]
> stonewall
> [sequentielllesen]
> rw=read
[...]

> What's going on here? Where does the difference between 6389 IOPS for
> this simpler read job file versus 60635 IOPS for the IOPS job file
> come from? These results compared to the results from the IOPS job do
> not make sense to me. Is it just random data versus zeros? Which values
> are more realistic? I thought that on an SSD, random I/O versus
> sequential I/O should not cause such a big difference.
> 
> Files are laid out as follows:
> 
> martin@merkaba:~[…]> sudo filefrag zufälliglesen.1.0 sequentielllesen.2.0 iops1
> zufälliglesen.1.0: 17 extents found
> sequentielllesen.2.0: 17 extents found
> iops1: 258 extents found
> 
> Not that it should matter much on an SSD.
> 
> This is on a ThinkPad T520 with an Intel i5 Sandy Bridge dual core, 8 GB
> of RAM and the Intel SSD 320 mentioned before, on Ext4 on LVM with the
> Debian Linux 3.0.0 kernel package.

I think I found it.

It depends on whether I specify the filename explicitly *and* globally or 
not. When I specify it globally, the job runs fast, no matter whether the 
file contains zeros or random data - as expected for this SSD. When I do not 
specify it, or only specify it inside a job section, the job runs slow.


What follows is the complete investigation and explanation of why this was 
the case:

The only difference between the two job files is the filename option. This is the fast variant:

martin@merkaba:~[...]> cat zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job
[global]
rw=randread
size=2g
bsrange=2-16k
filename=zufälliglesen.1.0

[zufälliglesen]
stonewall
[sequentielllesen]
rw=read

The only difference in the implicit-filename job is that I commented out the 
filename option. But since the given filename matches exactly what fio would 
choose by itself, fio should use *the same* file in both cases.
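
As far as I know, fio names its data files <jobname>.<jobnumber>.<filenumber> 
when no filename is given, which matches the files filefrag listed above. So 
with the filename option commented out, the two job sections effectively get 
these default file names:

[zufälliglesen]
# no filename given -> fio uses zufälliglesen.1.0 (job 1, file 0)
stonewall
[sequentielllesen]
# no filename given -> fio uses sequentielllesen.2.0 (job 2, file 0)
rw=read

In other words, without the global filename the two jobs work on two 
*different* files, with it they share zufälliglesen.1.0.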


Steps to (hopefully) reproduce it:

1. Do the following once to have the data files created:

fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job

2. Do

su -c "echo 3 > /proc/sys/vm/drop_caches" ; fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.job > zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.results

fio runs slow.

3. Do

su -c "echo 3 > /proc/sys/vm/drop_caches" ; fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job > zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.results

fio runs fast.

4. Use kompare, vimdiff or another side-by-side diff tool to compare the two results files.
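
For example:

vimdiff zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.results zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.results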


I think the echo 3 > /proc/sys/vm/drop_caches is not actually needed; as far 
as I understand, fio invalidates the page cache for its files unless told 
otherwise.
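
If I read the fio HOWTO correctly, this is the invalidate option, which 
defaults to on. It can be set explicitly - or switched off to deliberately 
measure reads from the page cache:

[global]
# default: invalidate cached pages of the job's files before the I/O starts
invalidate=1
# setting invalidate=0 instead would keep the cache and measure cached reads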


Aside from the speed difference, the only difference I found in the output 
that might explain it is this, for the fast run:

  cpu          : usr=11.54%, sys=9.26%, ctx=3392, majf=0, minf=28

versus this, for the slow run:

  cpu          : usr=4.87%, sys=11.20%, ctx=261968, majf=0, minf=26


Any hints as to why giving or not giving the filename makes such a big 
difference?

I also tested whether this might be a UTF-8 issue and rewrote the job files 
without any umlauts. That didn't make any difference.

I narrowed it down even further:

martin@merkaba:~[...]> diff -u zweierlei-lesen-2gb-variable-blockgroessen-jobfile-given-in-section-no-utf8.job zweierlei-lesen-2gb-variable-blockgroessen-jobfile-given-no-utf8.job 
--- zweierlei-lesen-2gb-variable-blockgroessen-jobfile-given-in-section-no-utf8.job     2011-08-02 15:35:41.246226877 +0200
+++ zweierlei-lesen-2gb-variable-blockgroessen-jobfile-given-no-utf8.job        2011-08-02 15:50:00.073095677 +0200
@@ -2,9 +2,9 @@
 rw=randread
 size=2g
 bsrange=2-16k
+filename=zufaelliglesen.1.0
 
 [zufaelliglesen]
-filename=zufaelliglesen.1.0
 stonewall
 [sequentielllesen]
 rw=read

That single line - moving the filename option from the job section to the 
global section - makes the difference.


Okay, and now I understand it:

From the progress display I can see that fio starts the second job right 
away as well. When the filename is in the global section, both jobs use the 
same file, and with the stonewall option missing from the second job section 
the sequential read job even runs in parallel with the random read job. The 
sequential reader pulls the shared file into the page cache, so the random 
reads are presumably served mostly from RAM instead of the SSD - the disk 
stats of the fast run show only about 11,500 read I/Os reaching the device 
for 4 GB of data read, versus about 267,000 in the slow run. That would also 
explain the [rR] I thought I had seen in the progress display.


Okay, so now I know:

When I want multiple jobs to run one after another, I need to put a stonewall 
option into *each* job - *also into the last one* - because in fio's view 
there is no "last" job: fio sets up every job before it starts executing any 
of them. And it seems that fio runs every job that has no stonewall option 
right away, even if an earlier defined job has a stonewall option. Fio only 
serializes the jobs that do have a stonewall option, which might be a 
disadvantage if I want to run groups of jobs one after another, as in this 
example:

[readjob]
blabla

[parallelwritejob]
blabla
stonewall

[randomreadjob]
blabla

[sequentialreadjob]
blabla
stonewall

As far as I understand it, fio would then run both jobs that have no 
stonewall option straight away, while it also starts the first job that has 
a stonewall option. Then, whether the jobs without a stonewall option are 
still running or not, fio starts the second job that has a stonewall option.

So I understand it now. Personally I would prefer it if fio ran the first 
two jobs in parallel, then honoured the stonewall, and then ran the second 
two jobs in parallel. Is this possible somehow?
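
If I understand the stonewall description in the HOWTO correctly (wait for 
the preceding jobs in the job file to exit before starting this one), then 
putting the stonewall into the *first* job of the second group should give 
exactly that grouping. An untested sketch:

[readjob]
blabla

[parallelwritejob]
blabla

[randomreadjob]
blabla
# stonewall here: wait until the two jobs above have finished
stonewall

[sequentialreadjob]
# no stonewall, so this one starts together with randomreadjob
blabla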



With an additional "stonewall" for the last job in the iops.job file I made 
up, it also works when the filename is specified globally.

So here we go:

martin@merkaba:~[...]> fio iops-done-right.job 
zufälligschreiben: (g=0): rw=randwrite, bs=2-16K/2-16K, ioengine=sync, 
iodepth=1
sequentiellschreiben: (g=1): rw=write, bs=2-16K/2-16K, ioengine=sync, 
iodepth=1
zufälliglesen: (g=2): rw=randread, bs=2-16K/2-16K, ioengine=sync, 
iodepth=1
sequentielllesen: (g=3): rw=read, bs=2-16K/2-16K, ioengine=sync, iodepth=1
fio 1.57
Starting 4 processes
Jobs: 1 (f=1): [___R] [100.0% done] [268.3M/0K /s] [33.6K/0  iops] [eta 
00m:00s]                
zufälligschreiben: (groupid=0, jobs=1): err= 0: pid=20474
  write: io=2048.0MB, bw=16686KB/s, iops=5096 , runt=125687msec
    clat (usec): min=0 , max=292792 , avg=188.16, stdev=933.01
     lat (usec): min=0 , max=292792 , avg=188.42, stdev=933.04
    bw (KB/s) : min=  320, max=63259, per=100.00%, avg=16684.86, 
stdev=9052.82
  cpu          : usr=4.60%, sys=13.58%, ctx=344489, majf=0, minf=31
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     issued r/w/d: total=0/640575/0, short=0/0/0
     lat (usec): 2=4.55%, 4=23.30%, 10=27.49%, 20=5.00%, 50=2.07%
     lat (usec): 100=0.09%, 250=12.02%, 500=20.41%, 750=3.46%, 1000=0.11%
     lat (msec): 2=0.31%, 4=0.78%, 10=0.40%, 20=0.01%, 50=0.01%
     lat (msec): 100=0.01%, 250=0.01%, 500=0.01%
sequentiellschreiben: (groupid=1, jobs=1): err= 0: pid=20482
  write: io=2048.0MB, bw=49401KB/s, iops=6176 , runt= 42452msec
    clat (usec): min=0 , max=213632 , avg=132.32, stdev=1355.16
     lat (usec): min=1 , max=213632 , avg=132.65, stdev=1355.16
    bw (KB/s) : min=    2, max=79902, per=110.53%, avg=54600.95, 
stdev=24636.03
  cpu          : usr=10.83%, sys=21.02%, ctx=232933, majf=0, minf=34
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     issued r/w/d: total=0/262197/0, short=0/0/0
     lat (usec): 2=0.09%, 4=1.07%, 10=10.43%, 20=0.89%, 50=0.07%
     lat (usec): 100=35.03%, 250=51.57%, 500=0.57%, 750=0.03%, 1000=0.01%
     lat (msec): 2=0.04%, 4=0.08%, 10=0.02%, 20=0.06%, 50=0.01%
     lat (msec): 100=0.03%, 250=0.01%
zufälliglesen: (groupid=2, jobs=1): err= 0: pid=20484
  read : io=2048.0MB, bw=23151KB/s, iops=7050 , runt= 90584msec
    clat (usec): min=0 , max=70235 , avg=134.92, stdev=212.70
     lat (usec): min=0 , max=70236 , avg=135.16, stdev=212.79
    bw (KB/s) : min=    5, max=118959, per=100.36%, avg=23233.61, 
stdev=12885.69
  cpu          : usr=4.55%, sys=13.30%, ctx=259109, majf=0, minf=27
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     issued r/w/d: total=638666/0/0, short=0/0/0
     lat (usec): 2=21.65%, 4=30.12%, 10=7.42%, 20=0.69%, 50=0.06%
     lat (usec): 100=0.01%, 250=14.58%, 500=21.58%, 750=3.85%, 1000=0.01%
     lat (msec): 2=0.02%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
     lat (msec): 100=0.01%
sequentielllesen: (groupid=3, jobs=1): err= 0: pid=20820
  read : io=2048.0MB, bw=267392KB/s, iops=33432 , runt=  7843msec
    clat (usec): min=0 , max=4098 , avg=28.40, stdev=143.49
     lat (usec): min=0 , max=4098 , avg=28.51, stdev=143.49
    bw (KB/s) : min=176584, max=275993, per=100.04%, avg=267511.13, 
stdev=25285.86
  cpu          : usr=4.18%, sys=21.27%, ctx=8616, majf=0, minf=29
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
     issued r/w/d: total=262210/0/0, short=0/0/0
     lat (usec): 2=26.04%, 4=47.22%, 10=21.85%, 20=1.61%, 50=0.13%
     lat (usec): 100=0.02%, 250=0.01%, 500=0.01%, 750=0.14%, 1000=2.96%
     lat (msec): 2=0.01%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
  WRITE: io=2048.0MB, aggrb=16685KB/s, minb=17085KB/s, maxb=17085KB/s, 
mint=125687msec, maxt=125687msec

Run status group 1 (all jobs):
  WRITE: io=2048.0MB, aggrb=49400KB/s, minb=50586KB/s, maxb=50586KB/s, 
mint=42452msec, maxt=42452msec

Run status group 2 (all jobs):
   READ: io=2048.0MB, aggrb=23151KB/s, minb=23707KB/s, maxb=23707KB/s, 
mint=90584msec, maxt=90584msec

Run status group 3 (all jobs):
   READ: io=2048.0MB, aggrb=267391KB/s, minb=273808KB/s, maxb=273808KB/s, 
mint=7843msec, maxt=7843msec

Disk stats (read/write):
  dm-2: ios=832862/416089, merge=0/0, ticks=206456/6699696, 
in_queue=6907728, util=78.36%, aggrios=833046/418069, aggrmerge=95/663, 
aggrticks=206032/6668768, aggrin_queue=6873712, aggrutil=77.80%
    sda: ios=833046/418069, merge=95/663, ticks=206032/6668768, 
in_queue=6873712, util=77.80%

martin@merkaba:~[...]> diff -u iops.job iops-done-right.job 
--- iops.job    2011-07-29 16:40:41.776809061 +0200
+++ iops-done-right.job 2011-08-02 16:15:06.055626894 +0200
@@ -19,4 +19,5 @@
 stonewall
 [sequentielllesen]
 rw=read
+stonewall
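
For reference, here is the complete iops-done-right.job, i.e. the iops.job 
quoted above with just that one stonewall added:

[global]
size=2G
bsrange=2-16k
filename=iops1
numjobs=1
iodepth=1
# Random data for SSDs that compress
refill_buffers=1

[zufälligschreiben]
rw=randwrite
stonewall
[sequentiellschreiben]
rw=write
stonewall

[zufälliglesen]
rw=randread
stonewall
[sequentielllesen]
rw=read
stonewall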


So always question results that don't make sense.

May this serve as a pointer should anyone stumble upon something like this 
;)

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

Job file zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job and its results:

[global]
rw=randread
size=2g
bsrange=2-16k
filename=zufälliglesen.1.0

[zufälliglesen]
stonewall
[sequentielllesen]
rw=read
zufälliglesen: (g=0): rw=randread, bs=2-16K/2-16K, ioengine=sync, iodepth=1
sequentielllesen: (g=0): rw=read, bs=2-16K/2-16K, ioengine=sync, iodepth=1
fio 1.57
Starting 2 processes

zufälliglesen: (groupid=0, jobs=1): err= 0: pid=14723
  read : io=2048.0MB, bw=186066KB/s, iops=56794 , runt= 11271msec
    clat (usec): min=0 , max=103559 , avg=15.57, stdev=1069.35
     lat (usec): min=0 , max=103559 , avg=15.62, stdev=1069.35
    bw (KB/s) : min=  243, max=843245, per=53.37%, avg=198607.29, stdev=327979.47
  cpu          : usr=11.54%, sys=9.26%, ctx=3392, majf=0, minf=28
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=640132/0/0, short=0/0/0
     lat (usec): 2=67.88%, 4=25.43%, 10=5.79%, 20=0.36%, 50=0.05%
     lat (usec): 100=0.01%, 250=0.15%, 500=0.26%, 750=0.04%, 1000=0.01%
     lat (msec): 2=0.01%, 4=0.03%, 10=0.01%, 20=0.01%, 50=0.01%
     lat (msec): 100=0.01%, 250=0.01%
sequentielllesen: (groupid=0, jobs=1): err= 0: pid=14724
  read : io=2048.0MB, bw=255813KB/s, iops=31989 , runt=  8198msec
    clat (usec): min=0 , max=22658 , avg=30.46, stdev=171.40
     lat (usec): min=0 , max=22658 , avg=30.56, stdev=171.39
    bw (KB/s) : min=240308, max=264524, per=68.85%, avg=256220.81, stdev=6556.93
  cpu          : usr=4.39%, sys=17.18%, ctx=8630, majf=0, minf=28
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=262251/0/0, short=0/0/0
     lat (usec): 2=26.88%, 4=48.21%, 10=19.94%, 20=1.59%, 50=0.16%
     lat (usec): 100=0.09%, 250=0.04%, 500=0.05%, 750=0.16%, 1000=2.60%
     lat (msec): 2=0.21%, 4=0.06%, 50=0.01%

Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=372132KB/s, minb=190531KB/s, maxb=261952KB/s, mint=8198msec, maxt=11271msec

Disk stats (read/write):
  dm-2: ios=11567/15, merge=0/0, ticks=23592/508, in_queue=24100, util=79.23%, aggrios=11473/353, aggrmerge=156/14, aggrticks=24556/2332, aggrin_queue=26864, aggrutil=77.77%
    sda: ios=11473/353, merge=156/14, ticks=24556/2332, in_queue=26864, util=77.77%

Job file zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.job and its results:

[global]
rw=randread
size=2g
bsrange=2-16k
#filename=zufälliglesen.1.0

[zufälliglesen]
stonewall
[sequentielllesen]
rw=read
zufälliglesen: (g=0): rw=randread, bs=2-16K/2-16K, ioengine=sync, iodepth=1
sequentielllesen: (g=0): rw=read, bs=2-16K/2-16K, ioengine=sync, iodepth=1
fio 1.57
Starting 2 processes

zufälliglesen: (groupid=0, jobs=1): err= 0: pid=14745
  read : io=2048.0MB, bw=21516KB/s, iops=6567 , runt= 97469msec
    clat (usec): min=0 , max=103566 , avg=146.45, stdev=1099.54
     lat (usec): min=0 , max=103566 , avg=146.68, stdev=1099.56
    bw (KB/s) : min=  132, max=118508, per=49.82%, avg=21437.43, stdev=13479.42
  cpu          : usr=4.87%, sys=11.20%, ctx=261968, majf=0, minf=26
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=640153/0/0, short=0/0/0
     lat (usec): 2=21.55%, 4=30.22%, 10=7.16%, 20=0.58%, 50=0.06%
     lat (usec): 100=0.01%, 250=15.55%, 500=21.14%, 750=3.68%, 1000=0.01%
     lat (msec): 2=0.01%, 4=0.03%, 20=0.01%, 50=0.01%, 100=0.01%
     lat (msec): 250=0.01%
sequentielllesen: (groupid=0, jobs=1): err= 0: pid=14746
  read : io=2048.0MB, bw=256909KB/s, iops=32115 , runt=  8163msec
    clat (usec): min=0 , max=3979 , avg=30.33, stdev=166.59
     lat (usec): min=0 , max=3979 , avg=30.44, stdev=166.58
    bw (KB/s) : min=245618, max=267008, per=597.43%, avg=257087.19, stdev=6569.89
  cpu          : usr=3.97%, sys=17.30%, ctx=8475, majf=0, minf=28
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=262162/0/0, short=0/0/0
     lat (usec): 2=28.30%, 4=47.36%, 10=19.41%, 20=1.64%, 50=0.15%
     lat (usec): 100=0.06%, 250=0.01%, 500=0.01%, 750=0.11%, 1000=2.69%
     lat (msec): 2=0.20%, 4=0.06%

Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=43032KB/s, minb=22032KB/s, maxb=263075KB/s, mint=8163msec, maxt=97469msec

Disk stats (read/write):
  dm-2: ios=267158/84, merge=0/0, ticks=94852/676, in_queue=95528, util=82.25%, aggrios=267040/962, aggrmerge=164/37, aggrticks=94356/5816, aggrin_queue=99820, aggrutil=81.77%
    sda: ios=267040/962, merge=164/37, ticks=94356/5816, in_queue=99820, util=81.77%
