John, thanks for the discussion at LPC. One of the concerns we raised, however, was the performance test. The numbers below are rather obviously tainted. I think we need to get a better baseline before concluding anything...

Here's my main concern:

On 11/10/2018 3:50 AM, john.hubbard@xxxxxxxxx wrote:
From: John Hubbard <jhubbard@xxxxxxxxxx>
...
------------------------------------------------------
WITHOUT the patch:
------------------------------------------------------
reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=55.5MiB/s,w=0KiB/s][r=14.2k,w=0 IOPS][eta 00m:00s]
reader: (groupid=0, jobs=1): err= 0: pid=1750: Tue Nov 6 20:18:06 2018
   read: IOPS=13.9k, BW=54.4MiB/s (57.0MB/s)(1024MiB/18826msec)
~14000 4KB read IOPS is really, really low for an NVMe disk.
cpu : usr=2.39%, sys=95.30%, ctx=669, majf=0, minf=72
CPU is obviously the limiting factor. At these IOPS, CPU utilization should be far lower.
------------------------------------------------------
OR, here's a better run WITH the patch applied, and you can see that this is nearly as good as the "without" case:
------------------------------------------------------
reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=53.2MiB/s,w=0KiB/s][r=13.6k,w=0 IOPS][eta 00m:00s]
reader: (groupid=0, jobs=1): err= 0: pid=2521: Tue Nov 6 20:01:33 2018
   read: IOPS=13.4k, BW=52.5MiB/s (55.1MB/s)(1024MiB/19499msec)
Similar low IOPS.
cpu : usr=3.47%, sys=94.61%, ctx=370, majf=0, minf=73
Similar CPU saturation.
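To put a rough number on the CPU cost, here is a minimal Python sketch that derives IOPS and CPU time per I/O from the two quoted fio summaries. The `FioRun` class and its field names are hypothetical, made up for this illustration. It also assumes that the usr/sys percentages are relative to wall-clock time on a single CPU, which seems reasonable for a one-job run but is an assumption, not something the output states.

```python
from dataclasses import dataclass


@dataclass
class FioRun:
    """Figures copied from one quoted fio summary (hypothetical helper type)."""
    label: str
    total_bytes: int   # data transferred, e.g. 1024 MiB
    block_size: int    # bytes per I/O, e.g. 4096
    runtime_ms: int    # wall-clock runtime reported by fio
    usr_pct: float     # "usr=" from the cpu line
    sys_pct: float     # "sys=" from the cpu line

    @property
    def ios(self) -> int:
        # Number of I/Os completed during the run.
        return self.total_bytes // self.block_size

    @property
    def iops(self) -> float:
        return self.ios / (self.runtime_ms / 1000.0)

    @property
    def cpu_us_per_io(self) -> float:
        # Assumption: usr+sys is a fraction of wall time on one CPU.
        cpu_seconds = (self.usr_pct + self.sys_pct) / 100.0 * self.runtime_ms / 1000.0
        return cpu_seconds / self.ios * 1e6


MIB = 1024 * 1024
without_patch = FioRun("without patch", 1024 * MIB, 4096, 18826, 2.39, 95.30)
with_patch = FioRun("with patch", 1024 * MIB, 4096, 19499, 3.47, 94.61)

if __name__ == "__main__":
    for run in (without_patch, with_patch):
        print(f"{run.label}: {run.iops:.0f} IOPS, "
              f"{run.cpu_us_per_io:.1f} us of CPU per I/O")
```

Under those assumptions, both runs come out at roughly 70 microseconds of CPU per 4KB read, which is why the IOPS figures look CPU-bound rather than device-bound.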
I get nearly 400,000 4KB IOPS on my tiny desktop, which has a 25W i7-7500 and a Samsung PM961 128GB NVMe (stock Bionic 4.15 kernel and fio version 3.1). Even then, the CPU saturates, so it's not necessarily a perfect test.

I'd like to see your runs both get to "max" IOPS, i.e. CPU < 100%, and compare the CPU numbers. This would give the best comparison for making a decision.

Can you confirm what type of hardware you're running this test on? CPU, memory speed and capacity, and NVMe device especially?

Tom.