> -----Original Message-----
> From: Antoine Beaupré [mailto:anarcat@xxxxxxxxxxxxxx]
> Sent: Wednesday, May 18, 2022 3:45 PM
> To: fio@xxxxxxxxxxxxxxx
> Subject: running jobs serially
>
> Hi,
>
> One of the things I've been struggling with in fio for a while is how
> to run batches of jobs with it.
>
> I know I can call fio multiple times with different job files or
> parameters, that's easy. But what I'd like to do is have a *single* job
> file (or even multiple, actually) that would describe *multiple*
> workloads that would need to be tested.
>
> In particular, I'm looking for a way to reproduce the benchmarks
> suggested here:
>
> https://arstechnica.com/gadgets/2020/02/how-fast-are-your-disks-find-out-the-open-source-way-with-fio/
>
> ... without having to write all the glue the author had to make here:
>
> https://github.com/jimsalterjrs/fio-test-scaffolding/
>
> ... which is quite a bit of goo.
>
> I was hoping a simple thing like this would just do it:
>
> [global]
> # cargo-culting Salter
> fallocate=none
> ioengine=posixaio
> runtime=60
> time_based=1
> end_fsync=1
> stonewall=1
> group_reporting=1
>
> # Single 4KiB random read/write process
> [randread-4k-4g-1x]
> stonewall=1
> rw=randread
> bs=4k
> size=4g
> numjobs=1
> iodepth=1
>
> [randwrite-4k-4g-1x]
> stonewall=1
> rw=randwrite
> bs=4k
> size=4g
> numjobs=1
> iodepth=1
>
> ... but looking at the "normal" --output-format, it *looks* like the
> jobs are all started at the same time. The files certainly seem to be
> allocated all at once:
>
> root@curie:/home# fio ars.fio
> randread-4k-4g-1x: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
> randwrite-4k-4g-1x: (g=1): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
> fio-3.25
> Starting 2 processes
> Jobs: 1 (f=1): [r(1),P(1)][0.0%][r=16.5MiB/s][r=4228 IOPS][eta 49710d:06h:28m:14s]
> randread-4k-4g-1x: (groupid=0, jobs=1): err= 0: pid=1033470: Wed May 18 15:41:04 2022
>   read: IOPS=4754, BW=18.6MiB/s (19.5MB/s)(18.6MiB/1001msec)
>     slat (nsec): min=1429, max=391678, avg=3239.76, stdev=6013.78
>     clat (usec): min=163, max=6917, avg=205.05, stdev=108.47
>     lat (usec): min=166, max=7308, avg=208.29, stdev=114.12
>     clat percentiles (usec):
>      | 1.00th=[ 169], 5.00th=[ 174], 10.00th=[ 174], 20.00th=[ 178],
>      | 30.00th=[ 182], 40.00th=[ 184], 50.00th=[ 196], 60.00th=[ 200],
>      | 70.00th=[ 204], 80.00th=[ 215], 90.00th=[ 239], 95.00th=[ 269],
>      | 99.00th=[ 412], 99.50th=[ 478], 99.90th=[ 635], 99.95th=[ 1045],
>      | 99.99th=[ 6915]
>    bw ( KiB/s): min=19248, max=19248, per=100.00%, avg=19248.00, stdev= 0.00, samples=1
>    iops        : min= 4812, max= 4812, avg=4812.00, stdev= 0.00, samples=1
>   lat (usec)   : 250=92.79%, 500=6.81%, 750=0.34%
>   lat (msec)   : 2=0.04%, 10=0.02%
>   cpu          : usr=2.90%, sys=2.90%, ctx=4767, majf=0, minf=44
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued rwts: total=4759,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
> randwrite-4k-4g-1x: (groupid=1, jobs=1): err= 0: pid=1033477: Wed May 18 15:41:04 2022
>   write: IOPS=32.3k, BW=126MiB/s (132MB/s)(174MiB/1378msec); 0 zone resets
>     slat (nsec): min=955, max=326042, avg=2726.11, stdev=2538.38
>     clat (nsec): min=343, max=6896.5k, avg=18139.12, stdev=69899.75
>     lat (usec): min=10, max=6899, avg=20.87, stdev=70.02
>     clat percentiles (usec):
>      | 1.00th=[ 11], 5.00th=[ 11], 10.00th=[ 12], 20.00th=[ 12],
>      | 30.00th=[ 13], 40.00th=[ 13], 50.00th=[ 14], 60.00th=[ 15],
>      | 70.00th=[ 16], 80.00th=[ 18], 90.00th=[ 26], 95.00th=[ 34],
>      | 99.00th=[ 62], 99.50th=[ 91], 99.90th=[ 231], 99.95th=[ 326],
>      | 99.99th=[ 4047]
>    bw ( KiB/s): min=196064, max=196064, per=100.00%, avg=196064.00, stdev= 0.00, samples=1
>    iops        : min=49016, max=49016, avg=49016.00, stdev= 0.00, samples=1
>   lat (nsec)   : 500=0.01%, 750=0.01%
>   lat (usec)   : 2=0.01%, 4=0.01%, 10=0.21%, 20=83.60%, 50=14.51%
>   lat (usec)   : 100=1.22%, 250=0.37%, 500=0.03%, 1000=0.01%
>   lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%
>   cpu          : usr=11.84%, sys=18.01%, ctx=46292, majf=0, minf=46
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued rwts: total=0,44457,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>   READ: bw=18.6MiB/s (19.5MB/s), 18.6MiB/s-18.6MiB/s (19.5MB/s-19.5MB/s), io=18.6MiB (19.5MB), run=1001-1001msec
>
> Run status group 1 (all jobs):
>   WRITE: bw=126MiB/s (132MB/s), 126MiB/s-126MiB/s (132MB/s-132MB/s), io=174MiB (182MB), run=1378-1378msec
>
> Disk stats (read/write):
>   dm-2: ios=4759/43132, merge=0/0, ticks=864/7281132, in_queue=7281996, util=67.32%, aggrios=4759/43181, aggrmerge=0/0, aggrticks=864/7378584, aggrin_queue=7379448, aggrutil=67.17%
>   dm-0: ios=4759/43181, merge=0/0, ticks=864/7378584, in_queue=7379448, util=67.17%, aggrios=4759/43124, aggrmerge=0/57, aggrticks=778/8680, aggrin_queue=9487, aggrutil=67.02%
>   sda: ios=4759/43124, merge=0/57, ticks=778/8680, in_queue=9487, util=67.02%
>
> Those timestamps, specifically, should not be the same:
>
> randwrite-4k-4g-1x: (groupid=1, jobs=1): err= 0: pid=1033477: Wed May 18 15:41:04 2022
> randread-4k-4g-1x: (groupid=0, jobs=1): err= 0: pid=1033470: Wed May 18 15:41:04 2022
>
> Am I missing something? Or are job files just *not* designed to run
> things serially?
>
> I looked in the archives for this, and only found this (unfulfilled,
> AFAICT) request:
>
> https://lore.kernel.org/fio/CANvN+emA01TZfbBx4aU+gg5CKfy+AEX_gZW7Jz4HMHvwkdBNoQ@xxxxxxxxxxxxxx/
>
> and:
>
> https://lore.kernel.org/fio/MWHPR04MB0320ED986E73B1E9994929B38F470@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> ... but that talks about serialize_overlap, which seems to be specific to
> handling requests sent in parallel, not serializing jobs themselves.
>
> For now, it feels like I need to revert to shell scripts, and that's
> a little annoying: it would be really nice to be able to carry a full
> workload in a single job file.
>
> Thanks, and sorry if that's a dumb question. :)
>
> --
> Antoine Beaupré
> torproject.org system administration

The jobs you are running have the *stonewall* option, which should make them run serially unless something is very broken. Here is the documentation for the stonewall option:

https://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-stonewall

You could add the write_bw_log=filename and log_unix_epoch=1 options to confirm.
You should see a timestamp for each I/O in the resulting logs, which will let you confirm that all of the writes happen after the reads.
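For reference, a minimal sketch of what that check could look like, based on the job file quoted above (the log prefix "serialcheck" is just an illustrative name, and the logging options could equally be set per job instead of in [global]):

[global]
fallocate=none
ioengine=posixaio
runtime=60
time_based=1
end_fsync=1
stonewall=1
group_reporting=1
# log bandwidth samples with absolute Unix-epoch timestamps
# ("serialcheck" is an arbitrary log file prefix)
write_bw_log=serialcheck
log_unix_epoch=1

[randread-4k-4g-1x]
stonewall=1
rw=randread
bs=4k
size=4g
numjobs=1
iodepth=1

[randwrite-4k-4g-1x]
stonewall=1
rw=randwrite
bs=4k
size=4g
numjobs=1
iodepth=1

With log_unix_epoch=1 the first column of each bandwidth log is an absolute Unix-epoch timestamp (in milliseconds) rather than time since the job started, and write_bw_log produces one log per job (named along the lines of serialcheck_bw.1.log and serialcheck_bw.2.log). If stonewall is working as expected, every timestamp in the randwrite log should be later than the last timestamp in the randread log.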