On 2/4/2019 12:21 AM, john.hubbard@xxxxxxxxx wrote:
From: John Hubbard <jhubbard@xxxxxxxxxx> Performance: here is an fio run on an NVMe drive, using this for the fio configuration file: [reader] direct=1 ioengine=libaio blocksize=4096 size=1g numjobs=1 rw=read iodepth=64 reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64 fio-3.3 Starting 1 process Jobs: 1 (f=1) reader: (groupid=0, jobs=1): err= 0: pid=7011: Sun Feb 3 20:36:51 2019 read: IOPS=190k, BW=741MiB/s (778MB/s)(1024MiB/1381msec) slat (nsec): min=2716, max=57255, avg=4048.14, stdev=1084.10 clat (usec): min=20, max=12485, avg=332.63, stdev=191.77 lat (usec): min=22, max=12498, avg=336.72, stdev=192.07 clat percentiles (usec): | 1.00th=[ 322], 5.00th=[ 322], 10.00th=[ 322], 20.00th=[ 326], | 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326], | 70.00th=[ 326], 80.00th=[ 330], 90.00th=[ 330], 95.00th=[ 330], | 99.00th=[ 478], 99.50th=[ 717], 99.90th=[ 1074], 99.95th=[ 1090], | 99.99th=[12256]
These latencies are concerning. The best results we saw at the end of November (previous approach) were MUCH flatter. These really start spiking at three 9's, and are sky-high at four 9's. The "stdev" values for clat and lat are about 10 times the previous. There's some kind of serious queuing contention here, that wasn't there in November.
bw ( KiB/s): min=730152, max=776512, per=99.22%, avg=753332.00, stdev=32781.47, samples=2 iops : min=182538, max=194128, avg=188333.00, stdev=8195.37, samples=2 lat (usec) : 50=0.01%, 100=0.01%, 250=0.07%, 500=99.26%, 750=0.38% lat (usec) : 1000=0.02% lat (msec) : 2=0.24%, 20=0.02% cpu : usr=15.07%, sys=84.13%, ctx=10, majf=0, minf=74
System CPU 84% is roughly double the November results of 45%. Ouch. Did you re-run the baseline on the new unpatched base kernel and can we see the before/after? Tom.
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0% issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): READ: bw=741MiB/s (778MB/s), 741MiB/s-741MiB/s (778MB/s-778MB/s), io=1024MiB (1074MB), run=1381-1381msec Disk stats (read/write): nvme0n1: ios=216966/0, merge=0/0, ticks=6112/0, in_queue=704, util=91.34%