RE: running jobs serially

Vincent Fu <vincent.fu@xxxxxxxxxxx> · Wed, 18 May 2022 22:41:24 +0000

> -----Original Message-----
> From: Antoine Beaupré [mailto:anarcat@xxxxxxxxxxxxxx]
> Sent: Wednesday, May 18, 2022 3:45 PM
> To: fio@xxxxxxxxxxxxxxx
> Subject: running jobs serially
> 
> Hi,
> 
> One of the things I've been struggling with in fio for a while is how
> to run batches of jobs with it.
> 
> I know I can call fio multiple times with different job files or
> parameters, that's easy. But what I'd like to do is have a *single* job
> file (or even multiple, actually) that would describe *multiple*
> workloads that would need to be tested.
> 
> In particular, I'm looking for a way to reproduce the benchmarks
> suggested here:
> 
> https://arstechnica.com/gadgets/2020/02/how-fast-are-your-disks-find-out-the-open-source-way-with-fio/
> 
> ... without having to write all the glue the author had to make here:
> 
> https://github.com/jimsalterjrs/fio-test-scaffolding/
> 
> ... which is quite a bit of goo.
> 
> I was hoping a simple thing like this would just do it:
> 
> [global]
> # cargo-culting Salter
> fallocate=none
> ioengine=posixaio
> runtime=60
> time_based=1
> end_fsync=1
> stonewall=1
> group_reporting=1
> 
> # Single 4KiB random read/write process
> [randread-4k-4g-1x]
> stonewall=1
> rw=randread
> bs=4k
> size=4g
> numjobs=1
> iodepth=1
> 
> [randwrite-4k-4g-1x]
> stonewall=1
> rw=randwrite
> bs=4k
> size=4g
> numjobs=1
> iodepth=1
> 
> ... but looking at the "normal" --output-format, it *looks* like the
> jobs are all started at the same time. The files certainly seem to be
> allocated all at once:
> 
> root@curie:/home# fio ars.fio
> randread-4k-4g-1x: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-
> 4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
> randwrite-4k-4g-1x: (g=1): rw=randwrite, bs=(R) 4096B-4096B, (W)
> 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
> fio-3.25
> Starting 2 processes
> Jobs: 1 (f=1): [r(1),P(1)][0.0%][r=16.5MiB/s][r=4228 IOPS][eta
> 49710d:06h:28m:14s]
> randread-4k-4g-1x: (groupid=0, jobs=1): err= 0: pid=1033470: Wed May
> 18 15:41:04 2022
>   read: IOPS=4754, BW=18.6MiB/s (19.5MB/s)(18.6MiB/1001msec)
>     slat (nsec): min=1429, max=391678, avg=3239.76, stdev=6013.78
>     clat (usec): min=163, max=6917, avg=205.05, stdev=108.47
>      lat (usec): min=166, max=7308, avg=208.29, stdev=114.12
>     clat percentiles (usec):
>      |  1.00th=[  169],  5.00th=[  174], 10.00th=[  174], 20.00th=[  178],
>      | 30.00th=[  182], 40.00th=[  184], 50.00th=[  196], 60.00th=[  200],
>      | 70.00th=[  204], 80.00th=[  215], 90.00th=[  239], 95.00th=[  269],
>      | 99.00th=[  412], 99.50th=[  478], 99.90th=[  635], 99.95th=[ 1045],
>      | 99.99th=[ 6915]
>    bw (  KiB/s): min=19248, max=19248, per=100.00%, avg=19248.00,
> stdev= 0.00, samples=1
>    iops        : min= 4812, max= 4812, avg=4812.00, stdev= 0.00, samples=1
>   lat (usec)   : 250=92.79%, 500=6.81%, 750=0.34%
>   lat (msec)   : 2=0.04%, 10=0.02%
>   cpu          : usr=2.90%, sys=2.90%, ctx=4767, majf=0, minf=44
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
> >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      issued rwts: total=4759,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
> randwrite-4k-4g-1x: (groupid=1, jobs=1): err= 0: pid=1033477: Wed May
> 18 15:41:04 2022
>   write: IOPS=32.3k, BW=126MiB/s (132MB/s)(174MiB/1378msec); 0 zone
> resets
>     slat (nsec): min=955, max=326042, avg=2726.11, stdev=2538.38
>     clat (nsec): min=343, max=6896.5k, avg=18139.12, stdev=69899.75
>      lat (usec): min=10, max=6899, avg=20.87, stdev=70.02
>     clat percentiles (usec):
>      |  1.00th=[   11],  5.00th=[   11], 10.00th=[   12], 20.00th=[   12],
>      | 30.00th=[   13], 40.00th=[   13], 50.00th=[   14], 60.00th=[   15],
>      | 70.00th=[   16], 80.00th=[   18], 90.00th=[   26], 95.00th=[   34],
>      | 99.00th=[   62], 99.50th=[   91], 99.90th=[  231], 99.95th=[  326],
>      | 99.99th=[ 4047]
>    bw (  KiB/s): min=196064, max=196064, per=100.00%, avg=196064.00,
> stdev= 0.00, samples=1
>    iops        : min=49016, max=49016, avg=49016.00, stdev= 0.00, samples=1
>   lat (nsec)   : 500=0.01%, 750=0.01%
>   lat (usec)   : 2=0.01%, 4=0.01%, 10=0.21%, 20=83.60%, 50=14.51%
>   lat (usec)   : 100=1.22%, 250=0.37%, 500=0.03%, 1000=0.01%
>   lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%
>   cpu          : usr=11.84%, sys=18.01%, ctx=46292, majf=0, minf=46
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
> >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      issued rwts: total=0,44457,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
> 
> Run status group 0 (all jobs):
>    READ: bw=18.6MiB/s (19.5MB/s), 18.6MiB/s-18.6MiB/s (19.5MB/s-
> 19.5MB/s), io=18.6MiB (19.5MB), run=1001-1001msec
> 
> Run status group 1 (all jobs):
>   WRITE: bw=126MiB/s (132MB/s), 126MiB/s-126MiB/s (132MB/s-
> 132MB/s), io=174MiB (182MB), run=1378-1378msec
> 
> Disk stats (read/write):
>     dm-2: ios=4759/43132, merge=0/0, ticks=864/7281132,
> in_queue=7281996, util=67.32%, aggrios=4759/43181, aggrmerge=0/0,
> aggrticks=864/7378584, aggrin_queue=7379448, aggrutil=67.17%
>     dm-0: ios=4759/43181, merge=0/0, ticks=864/7378584,
> in_queue=7379448, util=67.17%, aggrios=4759/43124, aggrmerge=0/57,
> aggrticks=778/8680, aggrin_queue=9487, aggrutil=67.02%
>   sda: ios=4759/43124, merge=0/57, ticks=778/8680, in_queue=9487,
> util=67.02%
> 
> 
> Those timestamps, specifically, should not be the same:
> 
> randwrite-4k-4g-1x: (groupid=1, jobs=1): err= 0: pid=1033477: Wed May
> 18 15:41:04 2022
> randread-4k-4g-1x: (groupid=0, jobs=1): err= 0: pid=1033470: Wed May
> 18 15:41:04 2022
> 
> Am I missing something? Or are job files just *not* designed to run
> things serially?
> 
> I looked in the archives for this, and only found this (unfulfilled,
> AFAICT) request:
> 
> https://lore.kernel.org/fio/CANvN+emA01TZfbBx4aU+gg5CKfy+AEX_gZW7Jz4HMHvwkdBNoQ@xxxxxxxxxxxxxx/
> 
> and:
> 
> https://lore.kernel.org/fio/MWHPR04MB0320ED986E73B1E9994929B38F470@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> 
> ... but that talks about serialize_overlap which seems to be specific to
> handling requests sent in parallel, not serializing jobs themselves.
> 
> For now, it feels like I need to revert to shell scripts, and that's kind
> of annoying: it would be really nice to be able to carry a full
> workload in a single job file.
> 
> Thanks, and sorry if that's a dumb question. :)
> 
> --
> Antoine Beaupré
> torproject.org system administration

The jobs you are running already set the *stonewall* option, which should make
them run serially unless something is very broken. Here is the documentation
for the stonewall option:

https://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-stonewall

To confirm, you could add the write_bw_log=filename and log_unix_epoch=1
options. The resulting logs should contain a timestamp for each I/O, so you can
check that all of the writes happen after the reads.
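
A minimal sketch of that check in Python, in case it helps. It assumes the
log layout described in fio's "Log File Formats" documentation: comma-separated
fields, with the timestamp in milliseconds in the first column and the data
direction in the third (0 for read, 1 for write). The function names and the
sample values are hypothetical, made up for illustration, and are not taken
from a real run.

```python
import csv
import io

READ, WRITE = 0, 1  # data-direction codes assumed from fio's log format docs


def parse_bw_log(text):
    """Return (timestamp_ms, direction) pairs from fio bandwidth-log text."""
    entries = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) < 3:
            continue  # skip blank or truncated lines
        entries.append((int(row[0]), int(row[2])))
    return entries


def writes_follow_reads(entries):
    """True if no write is timestamped before the last read."""
    reads = [t for t, d in entries if d == READ]
    writes = [t for t, d in entries if d == WRITE]
    if not reads or not writes:
        return True  # only one direction present, nothing to compare
    return min(writes) >= max(reads)


if __name__ == "__main__":
    # Hypothetical sample: two reads followed by two writes.
    sample = (
        "1652902863000, 19248, 0, 4096, 0\n"
        "1652902864000, 19248, 0, 4096, 4096\n"
        "1652902864500, 196064, 1, 4096, 0\n"
        "1652902865000, 196064, 1, 4096, 4096\n"
    )
    print(writes_follow_reads(parse_bw_log(sample)))
```

To use it on a real run, read the contents of the bandwidth logs written for
the two jobs, concatenate them, and pass the result to parse_bw_log(). Only
the first three columns are read, so any extra trailing fields in the log are
ignored.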