Hi,
On 5/6/19 at 16:53, vitalif@xxxxxxxxxx wrote:
>> Ok, average network latency from VM to OSDs ~0.4ms.
> It's rather bad; you can improve the latency by 0.3ms just by
> upgrading the network.
>> Single-threaded performance ~500-600 IOPS, or an average latency of
>> 1.6ms. Is that comparable to what others are seeing?
> Good "reference" numbers are 0.5ms for reads (~2000 iops) and 1ms for
> writes (~1000 iops).
> I confirm that the most powerful thing to do is disabling CPU
> powersave (governor=performance + cpupower -D 0). You usually get 2x
> single-thread iops at once.
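In case it's useful to others, here is a sketch of those two knobs as shell commands. I'm assuming the quoted "cpupower -D 0" is shorthand for the idle-set subcommand, so check the flags against your cpupower version:

    cpupower frequency-set -g performance   # put all cores on the performance governor
    cpupower idle-set -D 0                  # disable every idle (C-)state with wakeup latency above 0 us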
We have a small cluster with 4 OSD hosts, each with one Intel
SSDSC2KB019T8 SSD (D3-S4510 1.8T), connected by a 10G network (shared with
the VMs; it is not a busy cluster). Volumes are replica 3.
Network latency from one node to the other 3:
10 packets transmitted, 10 received, 0% packet loss, time 9166ms
rtt min/avg/max/mdev = 0.042/0.064/0.088/0.013 ms
10 packets transmitted, 10 received, 0% packet loss, time 9190ms
rtt min/avg/max/mdev = 0.047/0.072/0.110/0.017 ms
10 packets transmitted, 10 received, 0% packet loss, time 9219ms
rtt min/avg/max/mdev = 0.061/0.078/0.099/0.011 ms
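(Those were plain 10-count pings, i.e. something like the following; the hostnames are placeholders:)

    for h in osd2 osd3 osd4; do ping -c 10 "$h"; done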
Our fio test on a 4-core VM:
$ fio fio-job-randr.ini
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.12
Starting 1 process
test: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [r(1)][100.0%][r=10.3MiB/s][r=2636 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=4056: Wed Jun 5 17:14:33 2019
Description : [fio random 4k reads]
read: IOPS=2386, BW=9544KiB/s (9773kB/s)(559MiB/60001msec)
slat (nsec): min=0, max=616576, avg=10847.27, stdev=3253.55
clat (nsec): min=0, max=10346k, avg=406536.60, stdev=145643.92
lat (nsec): min=0, max=10354k, avg=417653.11, stdev=145740.26
clat percentiles (usec):
| 1.00th=[ 37], 5.00th=[ 202], 10.00th=[ 258], 20.00th=[ 318],
| 30.00th=[ 351], 40.00th=[ 383], 50.00th=[ 416], 60.00th=[ 445],
| 70.00th=[ 474], 80.00th=[ 502], 90.00th=[ 545], 95.00th=[ 578],
| 99.00th=[ 701], 99.50th=[ 742], 99.90th=[ 1004], 99.95th=[ 1500],
| 99.99th=[ 3752]
bw ( KiB/s): min= 0, max=10640, per=100.00%, avg=9544.13, stdev=486.02, samples=120
iops : min= 0, max= 2660, avg=2386.03, stdev=121.50, samples=120
lat (usec) : 2=0.01%, 50=2.94%, 100=0.17%, 250=6.20%, 500=70.34%
lat (usec) : 750=19.92%, 1000=0.33%
lat (msec) : 2=0.07%, 4=0.03%, 10=0.01%, 20=0.01%
cpu : usr=1.01%, sys=3.44%, ctx=143387, majf=0, minf=16
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=143163,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=9544KiB/s (9773kB/s), 9544KiB/s-9544KiB/s (9773kB/s-9773kB/s), io=559MiB (586MB), run=60001-60001msec
Disk stats (read/write):
dm-0: ios=154244/120, merge=0/0, ticks=63120/12, in_queue=63128, util=96.98%, aggrios=154244/58, aggrmerge=0/62, aggrticks=63401/40, aggrin_queue=62800, aggrutil=96.42%
sda: ios=154244/58, merge=0/62, ticks=63401/40, in_queue=62800, util=96.42%
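(The job file itself isn't shown above; a minimal reconstruction matching the reported parameters, i.e. 4k random reads, libaio, iodepth=1, one 1 GiB file, 60 s time-based run, would be something like the following, with the randwrite job differing only in rw= and the description:)

    cat > fio-job-randr.ini <<'EOF'
    [test]
    description=fio random 4k reads
    rw=randread
    bs=4k
    ioengine=libaio
    iodepth=1
    size=1g
    runtime=60
    time_based=1
    EOF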
So if I read this correctly, that's about 2,500 read IOPS. I see
governor=performance (I think that's the default on Proxmox VE). We
haven't touched cpupower, at least not beyond whatever our distribution
(Proxmox VE) does.
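As a sanity check, at iodepth=1 the IOPS figure is simply the inverse of the mean latency: 1 / 417.7 us is ~2394 IOPS, right in line with the 2386 that fio reports.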
For reference, the same test with random writes (KVM disk cache is write-back):
$ fio fio-job-randw.ini
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.12
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][w=35.5MiB/s][w=9077 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=4278: Wed Jun 5 17:35:51 2019
Description : [fio random 4k writes]
write: IOPS=9809, BW=38.3MiB/s (40.2MB/s)(2299MiB/60001msec); 0 zone resets
slat (nsec): min=0, max=856527, avg=13669.16, stdev=5257.21
clat (nsec): min=0, max=256305k, avg=86123.12, stdev=913448.71
lat (nsec): min=0, max=256328k, avg=100145.33, stdev=913512.45
clat percentiles (usec):
| 1.00th=[ 37], 5.00th=[ 41], 10.00th=[ 46], 20.00th=[ 54],
| 30.00th=[ 60], 40.00th=[ 65], 50.00th=[ 71], 60.00th=[ 78],
| 70.00th=[ 86], 80.00th=[ 96], 90.00th=[ 119], 95.00th=[ 151],
| 99.00th=[ 251], 99.50th=[ 297], 99.90th=[ 586], 99.95th=[ 857],
| 99.99th=[ 4490]
bw ( KiB/s): min= 0, max=52392, per=100.00%, avg=39243.27, stdev=3553.88, samples=119
iops : min= 0, max=13098, avg=9810.81, stdev=888.47, samples=119
lat (nsec) : 1000=0.01%
lat (usec) : 2=0.02%, 4=0.01%, 10=0.01%, 20=0.01%, 50=15.44%
lat (usec) : 100=67.16%, 250=16.36%, 500=0.90%, 750=0.06%, 1000=0.03%
lat (msec) : 2=0.02%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
lat (msec) : 250=0.01%, 500=0.01%
cpu : usr=6.50%, sys=17.61%, ctx=588616, majf=0, minf=16
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,588596,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=38.3MiB/s (40.2MB/s), 38.3MiB/s-38.3MiB/s (40.2MB/s-40.2MB/s), io=2299MiB (2411MB), run=60001-60001msec
Disk stats (read/write):
dm-0: ios=0/638056, merge=0/0, ticks=0/58752, in_queue=58740, util=85.10%, aggrios=0/640374, aggrmerge=0/52, aggrticks=0/59650, aggrin_queue=56472, aggrutil=82.40%
sda: ios=0/640374, merge=0/52, ticks=0/59650, in_queue=56472, util=82.40%
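The same inverse-latency check gives 1 / 100.1 us, i.e. ~9990 IOPS, for the writes. An average completion time of ~0.1 ms is well below the ~1 ms write reference quoted above, so with cache=writeback these numbers presumably say more about the guest/QEMU cache than about the real replica-3 write path.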
Cheers
--
Technical Director
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es