On Friday, 29 July 2011, Martin Steigerwald wrote:
> On Friday, 29 July 2011, Martin Steigerwald wrote:
> > Hi!
> >
> > I am currently writing an article about fio for a German print
> > magazine after having packaged it for Debian and used it in
> > performance analysis & tuning trainings.
> >
> > After introducing the concepts of fio with some basic job files,
> > I'd like to show how to do meaningful IOPS measurements that also
> > work with SSDs that compress.
> >
> > For some first tests I came up with:
> >
> > martin@merkaba:~[…]> cat iops.job
> > [global]
> > size=2G
> > bsrange=2-16k
> > filename=iops1
> > numjobs=1
> > iodepth=1
> > # Random data for SSDs that compress
> > refill_buffers=1
> >
> > [zufälligschreiben]
> > rw=randwrite
> > stonewall
> > [sequentiellschreiben]
> > rw=write
> > stonewall
> >
> > [zufälliglesen]
> > rw=randread
> > stonewall
> > [sequentielllesen]
> > rw=read
> >
> > (small German dictionary:
> > - zufällig => random
> > - lesen => read
> > - schreiben => write;)
> > [...]
>
> > Do you think the above job file could give realistic results? Any
> > suggestions?
>
> I got these results:
> With a simpler read job I have different results that puzzle me:
>
> martin@merkaba:~/Artikel/LinuxNewMedia/fio/Recherche/fio> cat zweierlei-lesen-2gb-variable-blockgrößen.job
> [global]
> rw=randread
> size=2g
> bsrange=2-16k
>
> [zufälliglesen]
> stonewall
> [sequentielllesen]
> rw=read
[...]
> What's going on here? Where does the difference between 6389 IOPS for
> this simpler read job file versus 60635 IOPS for the IOPS job file
> come from? These results compared to the results from the IOPS job do
> not make sense to me. Is it just random data versus zeros? Which
> values are more realistic? I thought that on an SSD only random
> versus sequential I/O should cause such a big difference.
>
> Files are laid out as follows:
>
> martin@merkaba:~[…]> sudo filefrag zufälliglesen.1.0 sequentielllesen.2.0 iops1
> zufälliglesen.1.0: 17 extents found
> sequentielllesen.2.0: 17 extents found
> iops1: 258 extents found
>
> Not that it should matter much on an SSD.
>
> This is on a ThinkPad T520 with an Intel i5 Sandybridge dual core,
> 8 GB of RAM and said Intel SSD 320, on Ext4 on LVM with the Linux
> 3.0.0 Debian package.

I think I found it. It depends on whether I specify a filename explicitly
*and* globally or not. When I specify it, the job runs fast - no matter
whether there are zeros or random data in the file, as expected for this
SSD. When I do not specify it, or specify it only inside a job section,
the job runs slow.

Now follows the complete investigation and explanation of why this was so.

The only difference between the job files is:

martin@merkaba:~[...]> cat zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job
[global]
rw=randread
size=2g
bsrange=2-16k
filename=zufälliglesen.1.0

[zufälliglesen]
stonewall
[sequentielllesen]
rw=read

The only difference in the implicit-filename job file is that I commented
out the filename option. But since that filename matches what fio would
choose by itself, fio should use *the same* file in both cases.

Steps to (hopefully) reproduce it:

1. Run the following once so that the test files get created:
   fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job

2. Run:
   su -c "echo 3 > /proc/sys/vm/drop_caches" ; fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.job > zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.results

   fio runs slow.
do su -c "echo 3 > /proc/sys/vm/drop_caches" ; fio zweierlei-lesen-2gb- variable-blockgrößen-jobfile-given.job > zweierlei-lesen-2gb-variable- blockgrößen-jobfile-given.results fio runs fast. 4. Use kompare, vimdiff or other side by side diff to compare the results. I do think that echo 3 > /proc/sys/vm/drop_caches is not needed, as far as I understand, fio clears caches if not told otherwise. Aside from the speed difference I only found one difference that might explain this fast cpu : usr=11.54%, sys=9.26%, ctx=3392, majf=0, minf=28 versus this slow cpu : usr=4.87%, sys=11.20%, ctx=261968, majf=0, minf=26 Any hints why giving or not giving the filename makes such a big difference? I also tested whether this might be an UTF-8 issue and rewrote the job files to not use any umlauts. That didn´t make any difference. I made it a bit more narrow even: martin@merkaba:~[...]> diff -u zweierlei-lesen-2gb-variable-blockgroessen- jobfile-given-in-section-no-utf8.job zweierlei-lesen-2gb-variable- blockgroessen-jobfile-given-no-utf8.job --- zweierlei-lesen-2gb-variable-blockgroessen-jobfile-given-in-section-no- utf8.job 2011-08-02 15:35:41.246226877 +0200 +++ zweierlei-lesen-2gb-variable-blockgroessen-jobfile-given-no-utf8.job 2011-08-02 15:50:00.073095677 +0200 @@ -2,9 +2,9 @@ rw=randread size=2g bsrange=2-16k +filename=zufaelliglesen.1.0 [zufaelliglesen] -filename=zufaelliglesen.1.0 stonewall [sequentielllesen] rw=read makes the difference. Okay, and now I understand it: As I see from the progress display I see that fio runs the second job first. When the filename is in the global section, both jobs use the same file. And with the missing stonewall option in the second job section the sequential read job even runs in parallel. I wondered whether I had [rR] there. Okay, then I know: When I want to have mutiple jobs run one after another I need to do a stonewall argument in *each* job. *Also the last one*, cause in the notion of fio there is no last job, as fio sets up each job before it starts job execution. And it seems that everything it runs everything that has no stonewall option right away even if an earlier defined job file has a stonewall option. Fio only thinks sequentially for the jobs that have a stonewall option which might be a disadvantage, if I want to run groups of jobs one after another: [readjob] blabla [parallelwritejob] blabla stonewall [randomreadjob] blabla [sequentialreadjob] blabla stonewall As far as I understand it, Fio would then run the both jobs without a stonewall option straight away while it also starts the first job with a stonewall option. Then, the jobs without the stonewall option might still running or not, fio starts the second job without the stonewall option. So I understand it now. Personally I would prefer, if fio runs the first two jobs in parallel then does the stonewall, and then the second two jobs in parallel. Is this possible somehow? With an additional "stonewall" for the last job in the iops.job file I made up it also works when the filename is specified globally. 
Thus, there we go:

martin@merkaba:~[...]> fio iops-done-right.job
zufälligschreiben: (g=0): rw=randwrite, bs=2-16K/2-16K, ioengine=sync, iodepth=1
sequentiellschreiben: (g=1): rw=write, bs=2-16K/2-16K, ioengine=sync, iodepth=1
zufälliglesen: (g=2): rw=randread, bs=2-16K/2-16K, ioengine=sync, iodepth=1
sequentielllesen: (g=3): rw=read, bs=2-16K/2-16K, ioengine=sync, iodepth=1
fio 1.57
Starting 4 processes
Jobs: 1 (f=1): [___R] [100.0% done] [268.3M/0K /s] [33.6K/0 iops] [eta 00m:00s]
zufälligschreiben: (groupid=0, jobs=1): err= 0: pid=20474
  write: io=2048.0MB, bw=16686KB/s, iops=5096 , runt=125687msec
    clat (usec): min=0 , max=292792 , avg=188.16, stdev=933.01
     lat (usec): min=0 , max=292792 , avg=188.42, stdev=933.04
    bw (KB/s) : min=320, max=63259, per=100.00%, avg=16684.86, stdev=9052.82
  cpu : usr=4.60%, sys=13.58%, ctx=344489, majf=0, minf=31
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=0/640575/0, short=0/0/0
     lat (usec): 2=4.55%, 4=23.30%, 10=27.49%, 20=5.00%, 50=2.07%
     lat (usec): 100=0.09%, 250=12.02%, 500=20.41%, 750=3.46%, 1000=0.11%
     lat (msec): 2=0.31%, 4=0.78%, 10=0.40%, 20=0.01%, 50=0.01%
     lat (msec): 100=0.01%, 250=0.01%, 500=0.01%
sequentiellschreiben: (groupid=1, jobs=1): err= 0: pid=20482
  write: io=2048.0MB, bw=49401KB/s, iops=6176 , runt= 42452msec
    clat (usec): min=0 , max=213632 , avg=132.32, stdev=1355.16
     lat (usec): min=1 , max=213632 , avg=132.65, stdev=1355.16
    bw (KB/s) : min=2, max=79902, per=110.53%, avg=54600.95, stdev=24636.03
  cpu : usr=10.83%, sys=21.02%, ctx=232933, majf=0, minf=34
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=0/262197/0, short=0/0/0
     lat (usec): 2=0.09%, 4=1.07%, 10=10.43%, 20=0.89%, 50=0.07%
     lat (usec): 100=35.03%, 250=51.57%, 500=0.57%, 750=0.03%, 1000=0.01%
     lat (msec): 2=0.04%, 4=0.08%, 10=0.02%, 20=0.06%, 50=0.01%
     lat (msec): 100=0.03%, 250=0.01%
zufälliglesen: (groupid=2, jobs=1): err= 0: pid=20484
  read : io=2048.0MB, bw=23151KB/s, iops=7050 , runt= 90584msec
    clat (usec): min=0 , max=70235 , avg=134.92, stdev=212.70
     lat (usec): min=0 , max=70236 , avg=135.16, stdev=212.79
    bw (KB/s) : min=5, max=118959, per=100.36%, avg=23233.61, stdev=12885.69
  cpu : usr=4.55%, sys=13.30%, ctx=259109, majf=0, minf=27
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=638666/0/0, short=0/0/0
     lat (usec): 2=21.65%, 4=30.12%, 10=7.42%, 20=0.69%, 50=0.06%
     lat (usec): 100=0.01%, 250=14.58%, 500=21.58%, 750=3.85%, 1000=0.01%
     lat (msec): 2=0.02%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
     lat (msec): 100=0.01%
sequentielllesen: (groupid=3, jobs=1): err= 0: pid=20820
  read : io=2048.0MB, bw=267392KB/s, iops=33432 , runt= 7843msec
    clat (usec): min=0 , max=4098 , avg=28.40, stdev=143.49
     lat (usec): min=0 , max=4098 , avg=28.51, stdev=143.49
    bw (KB/s) : min=176584, max=275993, per=100.04%, avg=267511.13, stdev=25285.86
  cpu : usr=4.18%, sys=21.27%, ctx=8616, majf=0, minf=29
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=262210/0/0, short=0/0/0
     lat (usec): 2=26.04%, 4=47.22%, 10=21.85%, 20=1.61%, 50=0.13%
     lat (usec): 100=0.02%, 250=0.01%, 500=0.01%, 750=0.14%, 1000=2.96%
     lat (msec): 2=0.01%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
  WRITE: io=2048.0MB, aggrb=16685KB/s, minb=17085KB/s, maxb=17085KB/s, mint=125687msec, maxt=125687msec

Run status group 1 (all jobs):
  WRITE: io=2048.0MB, aggrb=49400KB/s, minb=50586KB/s, maxb=50586KB/s, mint=42452msec, maxt=42452msec

Run status group 2 (all jobs):
   READ: io=2048.0MB, aggrb=23151KB/s, minb=23707KB/s, maxb=23707KB/s, mint=90584msec, maxt=90584msec

Run status group 3 (all jobs):
   READ: io=2048.0MB, aggrb=267391KB/s, minb=273808KB/s, maxb=273808KB/s, mint=7843msec, maxt=7843msec

Disk stats (read/write):
  dm-2: ios=832862/416089, merge=0/0, ticks=206456/6699696, in_queue=6907728, util=78.36%, aggrios=833046/418069, aggrmerge=95/663, aggrticks=206032/6668768, aggrin_queue=6873712, aggrutil=77.80%
    sda: ios=833046/418069, merge=95/663, ticks=206032/6668768, in_queue=6873712, util=77.80%

martin@merkaba:~[...]> diff -u iops.job iops-done-right.job
--- iops.job	2011-07-29 16:40:41.776809061 +0200
+++ iops-done-right.job	2011-08-02 16:15:06.055626894 +0200
@@ -19,4 +19,5 @@
 stonewall
 [sequentielllesen]
 rw=read
+stonewall

So always question results that don't make sense. May this serve as a
pointer should anyone stumble upon something like this ;)

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
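For reference, here are the reproduction steps 1 to 4 from further above,
collected into one small shell sketch. It assumes both job files sit in the
current directory and that su may ask for the root password; the file names
are the ones used above.

#!/bin/sh
# 1. Run once so that fio lays out the test files.
fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job

# 2. Drop the page cache (probably redundant, see the note above), then run
#    the job file without the global filename - the slow case.
su -c "echo 3 > /proc/sys/vm/drop_caches"
fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.job \
    > zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.results

# 3. The same again with the job file that has the global filename - the
#    fast case.
su -c "echo 3 > /proc/sys/vm/drop_caches"
fio zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.job \
    > zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.results

# 4. Compare the two result files side by side (kompare works as well).
vimdiff zweierlei-lesen-2gb-variable-blockgrößen-jobfile-implicit.results \
        zweierlei-lesen-2gb-variable-blockgrößen-jobfile-given.results

Appended below are the two job files and their results: first the job file
with the filename given in the global section together with its (fast)
results, then the variant with the filename commented out together with its
(slow) results.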
[global]
rw=randread
size=2g
bsrange=2-16k
filename=zufälliglesen.1.0

[zufälliglesen]
stonewall
[sequentielllesen]
rw=read
zufälliglesen: (g=0): rw=randread, bs=2-16K/2-16K, ioengine=sync, iodepth=1
sequentielllesen: (g=0): rw=read, bs=2-16K/2-16K, ioengine=sync, iodepth=1
fio 1.57
Starting 2 processes
zufälliglesen: (groupid=0, jobs=1): err= 0: pid=14723
  read : io=2048.0MB, bw=186066KB/s, iops=56794 , runt= 11271msec
    clat (usec): min=0 , max=103559 , avg=15.57, stdev=1069.35
     lat (usec): min=0 , max=103559 , avg=15.62, stdev=1069.35
    bw (KB/s) : min=243, max=843245, per=53.37%, avg=198607.29, stdev=327979.47
  cpu : usr=11.54%, sys=9.26%, ctx=3392, majf=0, minf=28
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=640132/0/0, short=0/0/0
     lat (usec): 2=67.88%, 4=25.43%, 10=5.79%, 20=0.36%, 50=0.05%
     lat (usec): 100=0.01%, 250=0.15%, 500=0.26%, 750=0.04%, 1000=0.01%
     lat (msec): 2=0.01%, 4=0.03%, 10=0.01%, 20=0.01%, 50=0.01%
     lat (msec): 100=0.01%, 250=0.01%
sequentielllesen: (groupid=0, jobs=1): err= 0: pid=14724
  read : io=2048.0MB, bw=255813KB/s, iops=31989 , runt= 8198msec
    clat (usec): min=0 , max=22658 , avg=30.46, stdev=171.40
     lat (usec): min=0 , max=22658 , avg=30.56, stdev=171.39
    bw (KB/s) : min=240308, max=264524, per=68.85%, avg=256220.81, stdev=6556.93
  cpu : usr=4.39%, sys=17.18%, ctx=8630, majf=0, minf=28
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=262251/0/0, short=0/0/0
     lat (usec): 2=26.88%, 4=48.21%, 10=19.94%, 20=1.59%, 50=0.16%
     lat (usec): 100=0.09%, 250=0.04%, 500=0.05%, 750=0.16%, 1000=2.60%
     lat (msec): 2=0.21%, 4=0.06%, 50=0.01%

Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=372132KB/s, minb=190531KB/s, maxb=261952KB/s, mint=8198msec, maxt=11271msec

Disk stats (read/write):
  dm-2: ios=11567/15, merge=0/0, ticks=23592/508, in_queue=24100, util=79.23%, aggrios=11473/353, aggrmerge=156/14, aggrticks=24556/2332, aggrin_queue=26864, aggrutil=77.77%
    sda: ios=11473/353, merge=156/14, ticks=24556/2332, in_queue=26864, util=77.77%
[global]
rw=randread
size=2g
bsrange=2-16k
#filename=zufälliglesen.1.0

[zufälliglesen]
stonewall
[sequentielllesen]
rw=read
zufälliglesen: (g=0): rw=randread, bs=2-16K/2-16K, ioengine=sync, iodepth=1
sequentielllesen: (g=0): rw=read, bs=2-16K/2-16K, ioengine=sync, iodepth=1
fio 1.57
Starting 2 processes
zufälliglesen: (groupid=0, jobs=1): err= 0: pid=14745
  read : io=2048.0MB, bw=21516KB/s, iops=6567 , runt= 97469msec
    clat (usec): min=0 , max=103566 , avg=146.45, stdev=1099.54
     lat (usec): min=0 , max=103566 , avg=146.68, stdev=1099.56
    bw (KB/s) : min=132, max=118508, per=49.82%, avg=21437.43, stdev=13479.42
  cpu : usr=4.87%, sys=11.20%, ctx=261968, majf=0, minf=26
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=640153/0/0, short=0/0/0
     lat (usec): 2=21.55%, 4=30.22%, 10=7.16%, 20=0.58%, 50=0.06%
     lat (usec): 100=0.01%, 250=15.55%, 500=21.14%, 750=3.68%, 1000=0.01%
     lat (msec): 2=0.01%, 4=0.03%, 20=0.01%, 50=0.01%, 100=0.01%
     lat (msec): 250=0.01%
sequentielllesen: (groupid=0, jobs=1): err= 0: pid=14746
  read : io=2048.0MB, bw=256909KB/s, iops=32115 , runt= 8163msec
    clat (usec): min=0 , max=3979 , avg=30.33, stdev=166.59
     lat (usec): min=0 , max=3979 , avg=30.44, stdev=166.58
    bw (KB/s) : min=245618, max=267008, per=597.43%, avg=257087.19, stdev=6569.89
  cpu : usr=3.97%, sys=17.30%, ctx=8475, majf=0, minf=28
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=262162/0/0, short=0/0/0
     lat (usec): 2=28.30%, 4=47.36%, 10=19.41%, 20=1.64%, 50=0.15%
     lat (usec): 100=0.06%, 250=0.01%, 500=0.01%, 750=0.11%, 1000=2.69%
     lat (msec): 2=0.20%, 4=0.06%

Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=43032KB/s, minb=22032KB/s, maxb=263075KB/s, mint=8163msec, maxt=97469msec

Disk stats (read/write):
  dm-2: ios=267158/84, merge=0/0, ticks=94852/676, in_queue=95528, util=82.25%, aggrios=267040/962, aggrmerge=164/37, aggrticks=94356/5816, aggrin_queue=99820, aggrutil=81.77%
    sda: ios=267040/962, merge=164/37, ticks=94356/5816, in_queue=99820, util=81.77%
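A side note on the fast result above (the run with the global filename):
its disk stats show only about 11,500 read I/Os reaching dm-2/sda for
roughly 900,000 I/Os that fio completed, which suggests that most of the
random reads were served from the page cache filled by the sequential
reader running in parallel. A minimal way to watch this while a job runs,
assuming a Linux /proc/meminfo, is a loop like this in a second terminal:

# A quickly growing "Cached:" value during the run hints that the reads
# end up in (and are served from) the page cache.
while true; do grep '^Cached:' /proc/meminfo; sleep 1; done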