Re: [PATCH RFC 0/2] block,nvme: latency-based I/O scheduler

On 3/28/24 11:38, Sagi Grimberg wrote:


On 26/03/2024 17:35, Hannes Reinecke wrote:
Hi all,

there have been several attempts to implement a latency-based I/O
scheduler for native nvme multipath, all of which had their issues.

So time to start afresh, this time using the QoS framework
already present in the block layer.
It consists of two parts:
- a new 'blk-nodelat' QoS module, which is just a simple per-node
   latency tracker
- a 'latency' nvme I/O policy
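
(Purely as an illustration of the tracking idea, here is a minimal
userspace C sketch of a decay-weighted per-node latency average. The
structure and function names and the EWMA-with-'decay' scheme are
assumptions made for the example, not the actual blk-nodelat
implementation.)

/*
 * Hypothetical sketch of a per-node latency tracker.  All names and
 * the EWMA-with-decay scheme are illustrative assumptions only.
 */
#include <stdint.h>
#include <stdio.h>

struct nodelat_stat {
	uint64_t avg_lat_ns;	/* decayed running average of completion latency */
	unsigned int decay;	/* weight of history; larger = slower adaptation */
};

/* Fold a new completion latency sample into the running average. */
static void nodelat_update(struct nodelat_stat *s, uint64_t lat_ns)
{
	if (!s->avg_lat_ns) {
		s->avg_lat_ns = lat_ns;
		return;
	}
	/* avg = avg - avg/decay + sample/decay */
	s->avg_lat_ns -= s->avg_lat_ns / s->decay;
	s->avg_lat_ns += lat_ns / s->decay;
}

int main(void)
{
	struct nodelat_stat s = { .avg_lat_ns = 0, .decay = 10 };
	uint64_t samples[] = { 120000, 95000, 300000, 110000 };

	for (unsigned int i = 0; i < 4; i++)
		nodelat_update(&s, samples[i]);

	printf("decayed average latency: %llu ns\n",
	       (unsigned long long)s.avg_lat_ns);
	return 0;
}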

Using the 'tiobench' fio script I'm getting:
   WRITE: bw=531MiB/s (556MB/s), 33.2MiB/s-52.4MiB/s (34.8MB/s-54.9MB/s), io=4096MiB (4295MB), run=4888-7718msec
   WRITE: bw=539MiB/s (566MB/s), 33.7MiB/s-50.9MiB/s (35.3MB/s-53.3MB/s), io=4096MiB (4295MB), run=5033-7594msec
    READ: bw=898MiB/s (942MB/s), 56.1MiB/s-75.4MiB/s (58.9MB/s-79.0MB/s), io=4096MiB (4295MB), run=3397-4560msec
    READ: bw=1023MiB/s (1072MB/s), 63.9MiB/s-75.1MiB/s (67.0MB/s-78.8MB/s), io=4096MiB (4295MB), run=3408-4005msec

for 'round-robin' and

   WRITE: bw=574MiB/s (601MB/s), 35.8MiB/s-45.5MiB/s (37.6MB/s-47.7MB/s), io=4096MiB (4295MB), run=5629-7142msec
   WRITE: bw=639MiB/s (670MB/s), 39.9MiB/s-47.5MiB/s (41.9MB/s-49.8MB/s), io=4096MiB (4295MB), run=5388-6408msec
    READ: bw=1024MiB/s (1074MB/s), 64.0MiB/s-73.7MiB/s (67.1MB/s-77.2MB/s), io=4096MiB (4295MB), run=3475-4000msec
    READ: bw=1013MiB/s (1063MB/s), 63.3MiB/s-72.6MiB/s (66.4MB/s-76.2MB/s), io=4096MiB (4295MB), run=3524-4042msec
for 'latency' with 'decay' set to 10.
That's on a 32G FC testbed running against a brd target,
with fio running 16 threads.

Can you quantify the improvement? Also, the name 'latency' suggests
that latency should be improved, no?

'latency' refers to a 'latency-based' I/O scheduler, i.e. it selects
the path with the least latency. It does not necessarily _improve_
the latency; e.g. for truly symmetric fabrics it doesn't.
It _does_ improve matters when running on asymmetric fabrics
(e.g. on a two-socket system with two PCI HBAs, each connected to one
socket, or like the example above with one path via 'loop' and the
other via 'tcp' and address '127.0.0.1').
And, of course, on congested fabrics it should be
able to direct I/O to the least congested path.
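
(To make "least latency" concrete, a simplified C sketch of the path
selection follows; the demo_path structure and select_least_latency()
helper are hypothetical stand-ins, not the actual nvme multipath
iterator.)

/*
 * Simplified, hypothetical illustration of a "least latency" policy:
 * walk the usable paths and pick the one whose tracked average
 * latency is lowest.  Not the actual nvme multipath code.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct demo_path {
	const char *name;
	bool usable;		/* path is live and not being reset */
	uint64_t avg_lat_ns;	/* value maintained by the latency tracker */
};

static struct demo_path *select_least_latency(struct demo_path *paths, size_t n)
{
	struct demo_path *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!paths[i].usable)
			continue;
		if (!best || paths[i].avg_lat_ns < best->avg_lat_ns)
			best = &paths[i];
	}
	return best;		/* NULL if no usable path is left */
}

int main(void)
{
	struct demo_path paths[] = {
		{ "path via 'loop'", true,  40000 },
		{ "path via 'tcp'",  true, 180000 },
	};
	struct demo_path *p = select_least_latency(paths, 2);

	printf("selected: %s\n", p ? p->name : "none");
	return 0;
}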

But I'll see about extracting the latency numbers, too.

What I really wanted to show is that we _can_ track latency without
harming performance.

Cheers,

Hannes




