Re: NVME performance regression in Linux 5.x due to lack of block level IO queueing

Good call. It turns out that the cache issue is resolved in 5.17. I tried a number of kernels and narrowed the problem down: it started after 4.9 and no later than 4.15, and was fixed some time after 5.13. That is, 4.9 is good, 4.15 is bad, 5.13 is bad, and 5.17 is good. I did not bisect further to find the specific versions where the behavior changed.

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
nvme1n1       2758.00 2783.00  11032.00  11132.00     0.00     0.00   0.00   0.00    0.10    0.03   0.36     4.00     4.00   0.18 100.00
nvme0n1       2830.00 2875.00  11320.00  11500.00     0.00     0.00   0.00   0.00    0.10    0.03   0.39     4.00     4.00   0.18 100.00
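
As a rough cross-check of the iostat columns above, here is a minimal Python sketch (the function names are mine, not from any tool). It divides kB/s by requests/s to get the average request size, and uses Little's law on the rounded await values to estimate the average queue depth. This is only an approximation: as far as I know, iostat derives aqu-sz from the kernel's time-in-queue counter rather than from the printed awaits, so small differences are expected.

```python
# Sanity-check sketch for the iostat rows above (helper names are hypothetical).

def avg_request_kb(kb_per_s, reqs_per_s):
    """Average request size in kB: throughput divided by request rate."""
    return kb_per_s / reqs_per_s if reqs_per_s else 0.0


def littles_law_queue_depth(r_s, r_await_ms, w_s, w_await_ms):
    """Estimate average queue depth as arrival rate times time in system.

    Await values are in milliseconds, so divide by 1000 to get seconds.
    """
    return (r_s * r_await_ms + w_s * w_await_ms) / 1000.0


# Values copied from the iostat output above.
rows = {
    "nvme1n1": dict(r_s=2758.0, w_s=2783.0, rkb_s=11032.0, wkb_s=11132.0,
                    r_await=0.10, w_await=0.03),
    "nvme0n1": dict(r_s=2830.0, w_s=2875.0, rkb_s=11320.0, wkb_s=11500.0,
                    r_await=0.10, w_await=0.03),
}

for dev, s in rows.items():
    read_kb = avg_request_kb(s["rkb_s"], s["r_s"])
    write_kb = avg_request_kb(s["wkb_s"], s["w_s"])
    depth = littles_law_queue_depth(s["r_s"], s["r_await"],
                                    s["w_s"], s["w_await"])
    print(f"{dev}: read {read_kb:.2f} kB/req, write {write_kb:.2f} kB/req, "
          f"estimated queue depth {depth:.2f}")
```

The request sizes come out at 4 kB, matching rareq-sz and wareq-sz, and the estimated queue depths land close to the reported aqu-sz values, well below 1 on both devices.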

With regard to performance between 4.4.0 and 5.17: for a single thread, 4.4.0 still outperformed 5.17. However, the 5.17 kernel was significantly better with multiple threads. In fact, it is so much better that I don't quite believe the results (about 10x over 5.4.0 and about 4.5x over 4.4.0 in the 16-job run below). Is it expected that a single thread would be slower in 5.17, while recent improvements make it possible to run many of them in parallel more efficiently?

# /usr/local/bin/fio -name=randrw -filename=/opt/foo -direct=1 -iodepth=1 -thread -rw=randrw -ioengine=psync -bs=4k -size=10G -numjobs=16 -group_reporting=1 -runtime=120

// Ubuntu 16.04 / Linux 4.4.0:
Run status group 0 (all jobs):
   READ: bw=54.5MiB/s (57.1MB/s), 54.5MiB/s-54.5MiB/s (57.1MB/s-57.1MB/s), io=6537MiB (6854MB), run=120002-120002msec
  WRITE: bw=54.5MiB/s (57.2MB/s), 54.5MiB/s-54.5MiB/s (57.2MB/s-57.2MB/s), io=6544MiB (6862MB), run=120002-120002msec

// Ubuntu 18.04 / Linux 5.4.0:
Run status group 0 (all jobs):
   READ: bw=23.5MiB/s (24.7MB/s), 23.5MiB/s-23.5MiB/s (24.7MB/s-24.7MB/s), io=2821MiB (2959MB), run=120002-120002msec
  WRITE: bw=23.5MiB/s (24.6MB/s), 23.5MiB/s-23.5MiB/s (24.6MB/s-24.6MB/s), io=2819MiB (2955MB), run=120002-120002msec

// Ubuntu 18.04 / Linux 5.17:
Run status group 0 (all jobs):
   READ: bw=244MiB/s (255MB/s), 244MiB/s-244MiB/s (255MB/s-255MB/s), io=28.6GiB (30.7GB), run=120001-120001msec
  WRITE: bw=244MiB/s (256MB/s), 244MiB/s-244MiB/s (256MB/s-256MB/s), io=28.6GiB (30.7GB), run=120001-120001msec
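
To put numbers on the comparison, here is a small Python sketch that pulls the aggregate bandwidth out of the fio summary lines above and computes the ratios between kernels. The regular expression and helper names are my own and only handle the "bw=...MiB/s" form shown here; the IOPS figure assumes the 4k block size from the command line.

```python
import re

# Matches the group summary lines shown above, e.g. "   READ: bw=54.5MiB/s (...)".
# Assumption: bandwidth is reported in MiB/s, as in all three runs here.
FIO_BW = re.compile(r"^\s*(READ|WRITE): bw=([0-9.]+)MiB/s")


def parse_bw_mib(summary):
    """Return {"READ": MiB/s, "WRITE": MiB/s} parsed from fio summary text."""
    out = {}
    for line in summary.splitlines():
        m = FIO_BW.match(line)
        if m:
            out[m.group(1)] = float(m.group(2))
    return out


def iops_at_4k(bw_mib_s):
    """Convert MiB/s to IOPS for 4 KiB requests (1024 KiB per MiB / 4 KiB)."""
    return bw_mib_s * 1024 / 4


# Summary lines copied (truncated) from the three runs above.
results = {
    "4.4.0": "   READ: bw=54.5MiB/s (57.1MB/s)\n  WRITE: bw=54.5MiB/s (57.2MB/s)",
    "5.4.0": "   READ: bw=23.5MiB/s (24.7MB/s)\n  WRITE: bw=23.5MiB/s (24.6MB/s)",
    "5.17": "   READ: bw=244MiB/s (255MB/s)\n  WRITE: bw=244MiB/s (256MB/s)",
}

read_bw = {kernel: parse_bw_mib(text)["READ"] for kernel, text in results.items()}

for kernel, bw in read_bw.items():
    print(f"{kernel}: {bw} MiB/s read, about {iops_at_4k(bw):.0f} read IOPS")

print(f"5.17 vs 5.4.0: {read_bw['5.17'] / read_bw['5.4.0']:.1f}x")
print(f"5.17 vs 4.4.0: {read_bw['5.17'] / read_bw['4.4.0']:.1f}x")
print(f"5.4.0 vs 4.4.0: {read_bw['5.4.0'] / read_bw['4.4.0']:.2f}x")
```

So with 16 jobs, 5.17 is roughly 10x faster than 5.4.0 and roughly 4.5x faster than 4.4.0, while 5.4.0 manages less than half of what 4.4.0 did.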

Thanks,
Michael



