Hi,

ok, thanks.  I am speculating here ... the random pattern (--rw=randread)
in the threaded+rr case is not the same as with separate processes - and we
are getting a lot more page-cache re-use.  But no matter, I know enough now
to collect the i/o perf info we need.

TY again for fio!

take care,
-mark

On Thu, Dec 29, 2022 at 4:12 PM Vincent Fu <vincentfu@xxxxxxxxx> wrote:
>
> On 12/29/22 13:16, M Kelly wrote:
> > Hi,
> >
> > Thank you for your reply and that info, I really appreciate it.
> > One last question, if you have time -
> >
> > I incorrectly thought the cmdline options:
> >
> > --numjobs=8 --filename=1:2:3:4:5:6:7:8 --size=X (or perhaps --filesize=X)
> >
> > would have each of the 8 threads doing i/o independently and concurrently.
> >
> > Q: Is there a way to use a single command line to get X threads doing
> > i/o in parallel to separate files, so that it is similar to my 8 separate
> > cmdlines?
> >
> > thx again, and wishing you a happy 2023
> > -mark
> >
>
> Oh, I missed that in your original message. numjobs=8 will create 8
> independent jobs that read from the 8 files in a round-robin fashion. I
> would still expect performance to differ from 8 jobs reading from
> separate files.
>
> > On Thu, Dec 29, 2022 at 9:28 AM Vincent Fu <vincentfu@xxxxxxxxx> wrote:
> >>
> >> On 12/27/22 18:50, M Kelly wrote:
> >>> Hi,
> >>>
> >>> Thank you for fio.
> >>> Apologies if this is the wrong place to ask.
> >>>
> >>> I am running fio on a laptop with an NVMe SSD and have a question
> >>> about differences between results.
> >>>
> >>> test.1 - test.8 are each 900 MiB files of random data created beforehand.
> >>>
> >>> run 1 -
> >>>
> >>> echo 3 | sudo tee /proc/sys/vm/drop_caches
> >>>
> >>> fio --name=fiotest
> >>> --filename=test.1:test.2:test.3:test.4:test.5:test.6:test.7:test.8
> >>> -gtod_reduce=1 --size=900M --rw=randread --bs=8K --direct=0
> >>> --numjobs=8 --invalidate=0 --ioengine=psync
> >>>
> >>> results:
> >>>
> >>> iops : min= 6794, max=86407, avg=26989.80, stdev=33609.35, samples=5
> >>> iops : min= 6786, max=84998, avg=26818.40, stdev=32972.35, samples=5
> >>> iops : min= 6816, max=83273, avg=26422.20, stdev=32200.02, samples=5
> >>> iops : min= 6838, max=88227, avg=27462.20, stdev=34405.56, samples=5
> >>> iops : min= 6868, max=85788, avg=26870.00, stdev=33332.10, samples=5
> >>> iops : min= 6818, max=84768, avg=26719.60, stdev=32880.23, samples=5
> >>> iops : min= 6810, max=86326, avg=27005.60, stdev=33581.89, samples=5
> >>> iops : min= 6856, max=81411, avg=26165.00, stdev=31345.85, samples=5
> >>> READ: bw=2644MiB/s (2773MB/s), 331MiB/s-332MiB/s (347MB/s-348MB/s),
> >>> io=7200MiB (7550MB), run=2709-2723msec
> >>>
> >>> run 2 -
> >>>
> >>> echo 3 | sudo tee /proc/sys/vm/drop_caches
> >>>
> >>> fio --name=fiotest1 --filename=test.1 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> fio --name=fiotest2 --filename=test.2 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> fio --name=fiotest3 --filename=test.3 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> fio --name=fiotest4 --filename=test.4 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> fio --name=fiotest5 --filename=test.5 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> fio --name=fiotest6 --filename=test.6 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> fio --name=fiotest7 --filename=test.7 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> fio --name=fiotest8 --filename=test.8 -gtod_reduce=1 --size=900M
> >>> --rw=randread --bs=8K --direct=0 --numjobs=1 --invalidate=0
> >>> --ioengine=psync &
> >>> wait
> >>>
> >>> results (from each) -
> >>>
> >>> iops : min= 5836, max= 6790, avg=6481.37, stdev=140.90, samples=35
> >>> READ: bw=50.6MiB/s (53.1MB/s), 50.6MiB/s-50.6MiB/s
> >>> (53.1MB/s-53.1MB/s), io=900MiB (944MB), run=17777-17777msec
> >>> iops : min= 5924, max= 6784, avg=6476.54, stdev=126.85, samples=35
> >>> READ: bw=50.6MiB/s (53.1MB/s), 50.6MiB/s-50.6MiB/s
> >>> (53.1MB/s-53.1MB/s), io=900MiB (944MB), run=17785-17785msec
> >>> iops : min= 5878, max= 6766, avg=6475.23, stdev=138.57, samples=35
> >>> READ: bw=50.6MiB/s (53.0MB/s), 50.6MiB/s-50.6MiB/s
> >>> (53.0MB/s-53.0MB/s), io=900MiB (944MB), run=17792-17792msec
> >>> iops : min= 5866, max= 6788, avg=6478.74, stdev=139.18, samples=35
> >>> READ: bw=50.6MiB/s (53.1MB/s), 50.6MiB/s-50.6MiB/s
> >>> (53.1MB/s-53.1MB/s), io=900MiB (944MB), run=17782-17782msec
> >>> iops : min= 5890, max= 6782, avg=6477.91, stdev=135.21, samples=35
> >>> READ: bw=50.6MiB/s (53.1MB/s), 50.6MiB/s-50.6MiB/s
> >>> (53.1MB/s-53.1MB/s), io=900MiB (944MB), run=17783-17783msec
> >>> iops : min= 5838, max= 6790, avg=6450.11, stdev=137.17, samples=35
> >>> READ: bw=50.3MiB/s (52.7MB/s), 50.3MiB/s-50.3MiB/s
> >>> (52.7MB/s-52.7MB/s), io=900MiB (944MB), run=17892-17892msec
> >>> iops : min= 5922, max= 6750, avg=6454.06, stdev=125.76, samples=35
> >>> READ: bw=50.3MiB/s (52.8MB/s), 50.3MiB/s-50.3MiB/s
> >>> (52.8MB/s-52.8MB/s), io=900MiB (944MB), run=17880-17880msec
> >>> iops : min= 5916, max= 6724, avg=6465.94, stdev=127.54, samples=35
> >>> READ: bw=50.5MiB/s (52.0MB/s), 50.5MiB/s-50.5MiB/s
> >>> (52.0MB/s-52.0MB/s), io=900MiB (944MB), run=17816-17816msec
> >>>
> >>> Question: if I am really doing 900 MiB of random reads from 8 separate
> >>> files in both tests, and the page-cache was empty before each test,
> >>> does this difference in performance make sense or am I reading the
> >>> results incorrectly?
> >>>
> >>> thank you for any info, advice, suggestions,
> >>> -mark
> >>
> >> Your first test directs fio to create a single job that reads from the
> >> files in a round robin fashion.
> >>
> >> The second test has eight independent jobs reading separately from the
> >> eight files as fast as they can.
> >>
> >> It's not a surprise that the results are different.
> >>
> >> Vincent
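
For completeness, one way to get behaviour closer to the eight separate
command lines from a single fio invocation is a job file in which each job
is pinned to its own file, rather than relying on numjobs to clone one job
that round-robins across all eight files. Below is a minimal sketch using
the same options as in the thread; the file name eight-files.fio and the
section names are made up for illustration:

; eight-files.fio -- each [section] is an independent job bound to one of
; the pre-created 900M test files
[global]
rw=randread
bs=8K
size=900M
direct=0
invalidate=0
ioengine=psync
gtod_reduce=1

[fiotest1]
filename=test.1

[fiotest2]
filename=test.2

[fiotest3]
filename=test.3

[fiotest4]
filename=test.4

[fiotest5]
filename=test.5

[fiotest6]
filename=test.6

[fiotest7]
filename=test.7

[fiotest8]
filename=test.8

Running "fio eight-files.fio" (after the same drop_caches step) starts the
eight jobs concurrently, each reading only its own file, so no job touches
the other seven files. Roughly the same thing can likely be expressed on a
single command line by repeating --name/--filename pairs, since each --name
begins a new job, but the job file form is easier to read and reuse.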