On 5/30/24 13:47, Keith Busch wrote:
> I suggested running a more lopsided workload on a high-contention tag
> set. Here's an example fio profile to exaggerate this:
>
> ---
> [global]
> rw=randread
> direct=1
> ioengine=io_uring
> time_based
> runtime=60
> ramp_time=10
>
> [zero]
> bs=131072
> filename=/dev/nvme0n1
> iodepth=256
> iodepth_batch_submit=64
> iodepth_batch_complete=64
>
> [one]
> bs=512
> filename=/dev/nvme0n2
> iodepth=1
> --
>
> My test nvme device has 2 namespaces, 1 IO queue, and only 63 tags.
>
> Without your patch:
>
> zero: (groupid=0, jobs=1): err= 0: pid=465: Thu May 30 13:29:43 2024
>   read: IOPS=14.0k, BW=1749MiB/s (1834MB/s)(103GiB/60002msec)
>     lat (usec): min=2937, max=40980, avg=16990.33, stdev=1732.37
> ...
> one: (groupid=0, jobs=1): err= 0: pid=466: Thu May 30 13:29:43 2024
>   read: IOPS=2726, BW=1363KiB/s (1396kB/s)(79.9MiB/60001msec)
>     lat (usec): min=45, max=4859, avg=327.52, stdev=335.25
>
> With your patch:
>
> zero: (groupid=0, jobs=1): err= 0: pid=341: Thu May 30 13:36:26 2024
>   read: IOPS=14.8k, BW=1852MiB/s (1942MB/s)(109GiB/60004msec)
>     lat (usec): min=3103, max=26191, avg=16322.77, stdev=1138.04
> ...
> one: (groupid=0, jobs=1): err= 0: pid=342: Thu May 30 13:36:26 2024
>   read: IOPS=1841, BW=921KiB/s (943kB/s)(54.0MiB/60001msec)
>     lat (usec): min=51, max=5935, avg=503.81, stdev=608.41
>
> So there's definitely a difference here that harms the lesser-used
> device for a modest gain on the more demanding device. Does it matter?
> I really don't know if I can answer that. It's just different, is all
> I'm saying.
Hi Keith,

Thank you for running this test. I propose that users who want better
fairness than what my patch provides use an appropriate mechanism for
improving fairness (e.g. blk-iocost or blk-iolatency). This leaves the
choice between maximum performance and maximum fairness to the user.
Does this sound good to you?

Thanks,

Bart.
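
P.S. In case it helps, here is a rough sketch of what such a cgroup v2
configuration could look like. The device numbers (259:0, 259:1), cgroup
names and weight/target values below are made-up examples, and the exact
io.cost.qos / io.latency syntax should be double-checked against
Documentation/admin-guide/cgroup-v2.rst:

---
# Enable the io controller for child cgroups and create one per fio job.
echo "+io" > /sys/fs/cgroup/cgroup.subtree_control
mkdir /sys/fs/cgroup/small-io /sys/fs/cgroup/bulk-io

# Option 1: blk-iocost. Enable it on the device (root cgroup only) and
# give the small-I/O job a larger proportional weight than the bulk job.
echo "259:0 enable=1" > /sys/fs/cgroup/io.cost.qos
echo "default 800" > /sys/fs/cgroup/small-io/io.weight
echo "default 100" > /sys/fs/cgroup/bulk-io/io.weight

# Option 2: blk-iolatency. Set a latency target (microseconds) for the
# small-I/O cgroup; I/O from sibling cgroups is throttled whenever that
# target is being missed.
echo "259:1 target=1000" > /sys/fs/cgroup/small-io/io.latency

# Start each fio job from a shell that has been moved into its cgroup.
echo $$ > /sys/fs/cgroup/small-io/cgroup.procs
--

The two options are alternatives; typically only one of these mechanisms
would be used for a given device.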