Reduced latency is killing performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

this really feels like a follow-up to the discussion we've had in Santa
Fe, but finally I'm able to substantiate it with some numbers.

I've made a patch to enable the megaraid_sas driver for multiqueue.
While this is pretty straightforward (I'll be sending the patchset later
on), the results are ... interesting.

I've run the 'ssd-test.fio' script from Jens' repository, and these
results for MQ/SQ (- is mq, + is sq):

 Run status group 0 (all jobs):
-   READ: io=10641MB, aggrb=181503KB/s, minb=181503KB/s,
maxb=181503KB/s, mint=60033msec, maxt=60033msec
+   READ: io=18370MB, aggrb=312572KB/s, minb=312572KB/s,
maxb=312572KB/s, mint=60181msec, maxt=60181msec

 Run status group 1 (all jobs):
-   READ: io=441444KB, aggrb=7303KB/s, minb=7303KB/s, maxb=7303KB/s,
mint=60443msec, maxt=60443msec
+   READ: io=223108KB, aggrb=3707KB/s, minb=3707KB/s, maxb=3707KB/s,
mint=60182msec, maxt=60182msec

 Run status group 2 (all jobs):
-  WRITE: io=22485MB, aggrb=383729KB/s, minb=383729KB/s,
maxb=383729KB/s, mint=60001msec, maxt=60001msec
+  WRITE: io=47421MB, aggrb=807581KB/s, minb=807581KB/s,
maxb=807581KB/s, mint=60129msec, maxt=60129msec

 Run status group 3 (all jobs):
-  WRITE: io=489852KB, aggrb=8110KB/s, minb=8110KB/s, maxb=8110KB/s,
mint=60399msec, maxt=60399msec
+  WRITE: io=489748KB, aggrb=8134KB/s, minb=8134KB/s, maxb=8134KB/s,
mint=60207msec, maxt=60207msec

 Disk stats (read/write):
-  sda: ios=2834412/5878578, merge=0/0, ticks=86269292/48364836,
in_queue=135345876, util=99.20%
+  sda: ios=205278/2680329, merge=4552593/9580622,
ticks=12539912/12965228, in_queue=25512312, util=99.59%

As you can see, we're really losing performance in the multiqueue case.
And the main reason for that is that we submit about _10 times_ as much
I/O as we do for the single-queue case.

So I guess having an I/O scheduler is critical, even for the scsi-mq case.

Full fio output attached.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@xxxxxxx			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
# Do some important numbers on SSD drives, to gauge what kind of
# performance you might get out of them.
#
# Sequential read and write speeds are tested, these are expected to be
# high. Random reads should also be fast, random writes are where crap
# drives are usually separated from the good drives.
#
# This uses a queue depth of 4. New SATA SSD's will support up to 32
# in flight commands, so it may also be interesting to increase the queue
# depth and compare. Note that most real-life usage will not see that
# large of a queue depth, so 4 is more representative of normal use.
#
[global]
ioengine=libaio
iodepth=256
size=10g
direct=1
runtime=60
numjobs=16
directory=/home
filename=ssd.test.file
group_reporting

[seq-read]
rw=read
stonewall

[rand-read]
rw=randread
stonewall

[seq-write]
rw=write
stonewall

[rand-write]
rw=randwrite
stonewall

seq-read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
rand-read: (g=1): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
seq-write: (g=2): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
rand-write: (g=3): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
fio-2.15
Starting 64 processes

seq-read: (groupid=0, jobs=16): err= 0: pid=3715: Thu Nov 10 16:48:49 2016
  read : io=18370MB, bw=312573KB/s, iops=78143, runt= 60181msec
    slat (usec): min=1, max=5664, avg= 2.33, stdev= 5.74
    clat (usec): min=30, max=1283.3K, avg=52320.04, stdev=114862.40
     lat (usec): min=37, max=1283.3K, avg=52322.47, stdev=114862.35
    clat percentiles (usec):
     |  1.00th=[  620],  5.00th=[ 1432], 10.00th=[ 2096], 20.00th=[ 2544],
     | 30.00th=[ 2608], 40.00th=[ 2640], 50.00th=[ 2672], 60.00th=[ 2736],
     | 70.00th=[ 2864], 80.00th=[14912], 90.00th=[284672], 95.00th=[342016],
     | 99.00th=[419840], 99.50th=[468992], 99.90th=[585728], 99.95th=[634880],
     | 99.99th=[749568]
    lat (usec) : 50=0.01%, 100=0.01%, 250=0.20%, 500=0.47%, 750=1.07%
    lat (usec) : 1000=0.90%
    lat (msec) : 2=5.92%, 4=63.81%, 10=5.22%, 20=3.28%, 50=1.84%
    lat (msec) : 100=1.19%, 250=4.25%, 500=11.54%, 750=0.28%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.65%, sys=1.42%, ctx=78762, majf=0, minf=9372
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=4702738/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256
rand-read: (groupid=1, jobs=16): err= 0: pid=3772: Thu Nov 10 16:48:49 2016
  read : io=223108KB, bw=3707.3KB/s, iops=926, runt= 60182msec
    slat (usec): min=2, max=1131.3K, avg=17217.00, stdev=95963.21
    clat (msec): min=16, max=5492, avg=4255.60, stdev=671.40
     lat (msec): min=171, max=6020, avg=4272.81, stdev=675.11
    clat percentiles (msec):
     |  1.00th=[  717],  5.00th=[ 3195], 10.00th=[ 4080], 20.00th=[ 4228],
     | 30.00th=[ 4359], 40.00th=[ 4359], 50.00th=[ 4424], 60.00th=[ 4424],
     | 70.00th=[ 4490], 80.00th=[ 4490], 90.00th=[ 4555], 95.00th=[ 4752],
     | 99.00th=[ 4817], 99.50th=[ 4948], 99.90th=[ 5473], 99.95th=[ 5473],
     | 99.99th=[ 5473]
    lat (msec) : 20=0.01%, 100=0.02%, 250=0.24%, 500=0.40%, 750=0.40%
    lat (msec) : 1000=0.46%, 2000=1.61%, >=2000=96.87%
  cpu          : usr=0.02%, sys=0.04%, ctx=1929, majf=0, minf=4259
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.5%, 32=0.9%, >=64=98.2%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=55777/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256
seq-write: (groupid=2, jobs=16): err= 0: pid=3790: Thu Nov 10 16:48:49 2016
  write: io=47421MB, bw=807582KB/s, iops=201895, runt= 60129msec
    slat (usec): min=2, max=3039, avg= 4.85, stdev=16.02
    clat (usec): min=95, max=794322, avg=20251.85, stdev=67558.84
     lat (usec): min=123, max=794326, avg=20256.83, stdev=67559.50
    clat percentiles (usec):
     |  1.00th=[  900],  5.00th=[  932], 10.00th=[  956], 20.00th=[  996],
     | 30.00th=[ 1048], 40.00th=[ 1112], 50.00th=[ 1160], 60.00th=[ 1224],
     | 70.00th=[ 1400], 80.00th=[ 1720], 90.00th=[ 2960], 95.00th=[232448],
     | 99.00th=[296960], 99.50th=[313344], 99.90th=[366592], 99.95th=[489472],
     | 99.99th=[774144]
    lat (usec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.09%, 1000=20.47%
    lat (msec) : 2=64.34%, 4=5.85%, 10=0.24%, 20=0.33%, 50=0.81%
    lat (msec) : 100=0.54%, 250=3.00%, 500=4.26%, 750=0.02%, 1000=0.03%
  cpu          : usr=1.41%, sys=5.44%, ctx=558403, majf=0, minf=12550
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=0/w=12139774/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256
rand-write: (groupid=3, jobs=16): err= 0: pid=3806: Thu Nov 10 16:48:49 2016
  write: io=489748KB, bw=8134.5KB/s, iops=2033, runt= 60207msec
    slat (usec): min=3, max=330352, avg=7843.31, stdev=43990.65
    clat (msec): min=65, max=3671, avg=2001.23, stdev=448.67
     lat (msec): min=68, max=3671, avg=2009.07, stdev=449.18
    clat percentiles (msec):
     |  1.00th=[  285],  5.00th=[  693], 10.00th=[ 1680], 20.00th=[ 1991],
     | 30.00th=[ 2040], 40.00th=[ 2089], 50.00th=[ 2114], 60.00th=[ 2147],
     | 70.00th=[ 2180], 80.00th=[ 2245], 90.00th=[ 2343], 95.00th=[ 2376],
     | 99.00th=[ 2540], 99.50th=[ 2606], 99.90th=[ 2704], 99.95th=[ 2737],
     | 99.99th=[ 2868]
    lat (msec) : 100=0.08%, 250=0.68%, 500=1.72%, 750=3.60%, 1000=1.28%
    lat (msec) : 2000=14.04%, >=2000=78.60%
  cpu          : usr=0.03%, sys=0.14%, ctx=11009, majf=0, minf=172
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=0.4%, >=64=99.2%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=0/w=122437/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
   READ: io=18370MB, aggrb=312572KB/s, minb=312572KB/s, maxb=312572KB/s, mint=60181msec, maxt=60181msec

Run status group 1 (all jobs):
   READ: io=223108KB, aggrb=3707KB/s, minb=3707KB/s, maxb=3707KB/s, mint=60182msec, maxt=60182msec

Run status group 2 (all jobs):
  WRITE: io=47421MB, aggrb=807581KB/s, minb=807581KB/s, maxb=807581KB/s, mint=60129msec, maxt=60129msec

Run status group 3 (all jobs):
  WRITE: io=489748KB, aggrb=8134KB/s, minb=8134KB/s, maxb=8134KB/s, mint=60207msec, maxt=60207msec

Disk stats (read/write):
  sda: ios=205278/2680329, merge=4552593/9580622, ticks=12539912/12965228, in_queue=25512312, util=99.59%
seq-read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
rand-read: (g=1): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
seq-write: (g=2): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
rand-write: (g=3): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
fio-2.15
Starting 64 processes

seq-read: (groupid=0, jobs=16): err= 0: pid=12365: Thu Nov 10 16:34:03 2016
  read : io=10641MB, bw=181503KB/s, iops=45375, runt= 60033msec
    slat (usec): min=2, max=168902, avg=349.71, stdev=2179.65
    clat (usec): min=27, max=1046.3K, avg=89851.03, stdev=94205.19
     lat (usec): min=921, max=1046.3K, avg=90200.97, stdev=94519.88
    clat percentiles (msec):
     |  1.00th=[    6],  5.00th=[   11], 10.00th=[   16], 20.00th=[   23],
     | 30.00th=[   32], 40.00th=[   43], 50.00th=[   57], 60.00th=[   77],
     | 70.00th=[  101], 80.00th=[  139], 90.00th=[  210], 95.00th=[  289],
     | 99.00th=[  445], 99.50th=[  506], 99.90th=[  644], 99.95th=[  709],
     | 99.99th=[  881]
    lat (usec) : 50=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.22%, 10=3.90%, 20=12.47%, 50=29.05%
    lat (msec) : 100=23.95%, 250=23.49%, 500=6.38%, 750=0.51%, 1000=0.03%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.38%, sys=1.53%, ctx=1261134, majf=0, minf=7465
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=2724047/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256
rand-read: (groupid=1, jobs=16): err= 0: pid=12424: Thu Nov 10 16:34:03 2016
  read : io=441444KB, bw=7303.5KB/s, iops=1825, runt= 60443msec
    slat (usec): min=3, max=5654.6K, avg=8705.16, stdev=98890.89
    clat (usec): min=499, max=54071K, avg=2020668.34, stdev=4066596.80
     lat (usec): min=534, max=54071K, avg=2029373.76, stdev=4080684.63
    clat percentiles (msec):
     |  1.00th=[   58],  5.00th=[   96], 10.00th=[  120], 20.00th=[  163],
     | 30.00th=[  219], 40.00th=[  310], 50.00th=[  482], 60.00th=[  881],
     | 70.00th=[ 1598], 80.00th=[ 2868], 90.00th=[ 5604], 95.00th=[ 8455],
     | 99.00th=[16712], 99.50th=[16712], 99.90th=[16712], 99.95th=[16712],
     | 99.99th=[16712]
    lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.03%, 20=0.08%, 50=0.47%
    lat (msec) : 100=5.07%, 250=28.49%, 500=16.49%, 750=6.91%, 1000=4.65%
    lat (msec) : 2000=11.35%, >=2000=26.44%
  cpu          : usr=0.03%, sys=0.10%, ctx=12943, majf=0, minf=4261
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=0.5%, >=64=99.1%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=110361/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256
seq-write: (groupid=2, jobs=16): err= 0: pid=12440: Thu Nov 10 16:34:03 2016
  write: io=22485MB, bw=383729KB/s, iops=95932, runt= 60001msec
    slat (usec): min=4, max=25089, avg=160.99, stdev=186.82
    clat (usec): min=9, max=99208, avg=42511.29, stdev=8094.51
     lat (usec): min=285, max=99223, avg=42672.98, stdev=8120.87
    clat percentiles (usec):
     |  1.00th=[27008],  5.00th=[35072], 10.00th=[35584], 20.00th=[37120],
     | 30.00th=[37632], 40.00th=[38656], 50.00th=[40192], 60.00th=[41728],
     | 70.00th=[43776], 80.00th=[47872], 90.00th=[54528], 95.00th=[60160],
     | 99.00th=[67072], 99.50th=[70144], 99.90th=[76288], 99.95th=[78336],
     | 99.99th=[84480]
    lat (usec) : 10=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.02%, 20=0.33%, 50=83.21%
    lat (msec) : 100=16.42%
  cpu          : usr=3.17%, sys=19.97%, ctx=10291638, majf=0, minf=28791
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=0/w=5756037/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256
rand-write: (groupid=3, jobs=16): err= 0: pid=12456: Thu Nov 10 16:34:03 2016
  write: io=489852KB, bw=8110.3KB/s, iops=2027, runt= 60399msec
    slat (usec): min=4, max=7321.2K, avg=7849.80, stdev=103311.36
    clat (msec): min=30, max=49117, avg=2007.24, stdev=3917.47
     lat (msec): min=30, max=49143, avg=2015.09, stdev=3929.12
    clat percentiles (msec):
     |  1.00th=[   73],  5.00th=[  120], 10.00th=[  147], 20.00th=[  194],
     | 30.00th=[  253], 40.00th=[  322], 50.00th=[  469], 60.00th=[  766],
     | 70.00th=[ 1352], 80.00th=[ 2704], 90.00th=[ 5473], 95.00th=[ 9372],
     | 99.00th=[16712], 99.50th=[16712], 99.90th=[16712], 99.95th=[16712],
     | 99.99th=[16712]
    lat (msec) : 50=0.18%, 100=2.31%, 250=27.05%, 500=22.01%, 750=8.03%
    lat (msec) : 1000=4.76%, 2000=11.62%, >=2000=24.04%
  cpu          : usr=0.05%, sys=0.16%, ctx=23792, majf=0, minf=181
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=0.4%, >=64=99.2%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=0/w=122463/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
   READ: io=10641MB, aggrb=181503KB/s, minb=181503KB/s, maxb=181503KB/s, mint=60033msec, maxt=60033msec

Run status group 1 (all jobs):
   READ: io=441444KB, aggrb=7303KB/s, minb=7303KB/s, maxb=7303KB/s, mint=60443msec, maxt=60443msec

Run status group 2 (all jobs):
  WRITE: io=22485MB, aggrb=383729KB/s, minb=383729KB/s, maxb=383729KB/s, mint=60001msec, maxt=60001msec

Run status group 3 (all jobs):
  WRITE: io=489852KB, aggrb=8110KB/s, minb=8110KB/s, maxb=8110KB/s, mint=60399msec, maxt=60399msec

Disk stats (read/write):
  sda: ios=2834412/5878578, merge=0/0, ticks=86269292/48364836, in_queue=135345876, util=99.20%

[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux