Hi,

1. There are 2 software iSCSI gateways (deployed on the OSD/monitor nodes), created with lrbd; the iSCSI target is LIO. Configuration:

    {
        "auth": [
            {
                "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
                "authentication": "none"
            }
        ],
        "targets": [
            {
                "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
                "hosts": [
                    { "host": "node2", "portal": "east" },
                    { "host": "node3", "portal": "west" }
                ]
            }
        ],
        "portals": [
            { "name": "east", "addresses": [ "10.0.52.92" ] },
            { "name": "west", "addresses": [ "10.0.52.93" ] }
        ],
        "pools": [
            {
                "pool": "rbd",
                "gateways": [
                    {
                        "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
                        "tpg": [
                            { "image": "testvol" }
                        ]
                    }
                ]
            }
        ]
    }
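For reference, the LIO state that lrbd programs from this JSON can be inspected on each gateway with the stock targetcli shell; a minimal check, assuming targetcli is installed on node2/node3:

    # List the resulting target, TPGs, portals and backstores on a gateway.
    sudo targetcli ls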
2. The Ceph cluster itself performs well. I created an RBD image on one of the Ceph nodes and the fio results are good: 4K randwrite IOPS=3013, bw=100MB/s. So I don't think the Ceph cluster itself is the bottleneck.
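To separate the cluster from the iSCSI path, a run like the following drives the image over librbd directly; this is only a sketch, assuming fio was built with rbd engine support and a usable client.admin keyring:

    # Benchmark testvol via librbd, bypassing the iSCSI gateways entirely.
    fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testvol \
        --rw=randwrite --bs=4k --iodepth=32 --direct=1 \
        --runtime=60 --time_based --name=rbd-baseline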
3. The SSD is an Intel S3510 480GB, an enterprise model, not a consumer one.
New test: cloning a VM in VMware can reach 100MB/s, but fio and dd tests inside the VM are still poor.
Hello,

On Fri, 1 Jul 2016 13:04:45 +0800 mq wrote:

> Hi list
>
> I have tested SUSE Enterprise Storage 3 using 2 iSCSI gateways attached
> to VMware. The performance is bad.
First off, it's somewhat funny that you're testing the repackaged SUSE
Ceph, but asking for help here (with Ceph being owned by Red Hat).

Aside from that, you're not telling us what these 2 iSCSI gateways are
(SW, HW specs/configuration).

Having iSCSI on top of Ceph is by the very nature of things going to be
slower than native Ceph.

Use "rbd bench" or a VM client with RBD to get a base number of what your
Ceph cluster is capable of; this will help identify where the slowdown
is. (A sketch of such a run follows the quoted specs below.)

> I have turned off VAAI following
> https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1033665
>
> My cluster:
> 3 Ceph nodes: 2*E5-2620, 64G mem, 2*1Gbps, (3*10K SAS, 1*480G SSD) per
> node, SSD as journal
> 1 VMware node: 2*E5-2620, 64G mem, 2*1Gbps
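One possible shape of that baseline, using the Jewel-era "rbd bench-write" command (the image name is assumed from the config above; check "rbd help bench-write" for the exact options on your build):

    # Random 4K writes, 16 threads, 1 GiB total, against rbd/testvol.
    rbd bench-write rbd/testvol --io-size 4096 --io-threads 16 \
        --io-total 1073741824 --io-pattern rand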
That's a slow (latency wise) network, but not your problem.

What SSD model? A 480GB size suggests a consumer model and that would
explain a lot.

Check your storage nodes with atop during the fio runs and see if you can
spot a bottleneck; a minimal sketch of that follows the quoted "ceph -s"
below.

Christian

> # ceph -s
>     cluster 0199f68d-a745-4da3-9670-15f2981e7a15
>      health HEALTH_OK
>      monmap e1: 3 mons at {node1=192.168.50.91:6789/0,node2=192.168.50.92:6789/0,node3=192.168.50.93:6789/0}
>             election epoch 22, quorum 0,1,2 node1,node2,node3
>      osdmap e200: 9 osds: 9 up, 9 in
>             flags sortbitwise
>       pgmap v1162: 448 pgs, 1 pools, 14337 MB data, 4935 objects
>             18339 MB used, 5005 GB / 5023 GB avail
>                  448 active+clean
>   client io 87438 kB/s wr, 0 op/s rd, 213 op/s wr
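A minimal way to do that watching, assuming atop (and optionally sysstat, for iostat) is installed on the OSD nodes:

    # Run on each OSD node while the fio test is active; 2-second samples.
    # Watch the DSK lines for the journal SSD and the SAS disks, and the
    # NET lines for the 1Gbps interfaces.
    atop 2

    # Alternative view: per-device utilization and await times.
    iostat -x 2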
> sudo ceph osd tree
> ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 4.90581 root default
> -2 1.63527     host node1
>  0 0.54509         osd.0       up  1.00000          1.00000
>  1 0.54509         osd.1       up  1.00000          1.00000
>  2 0.54509         osd.2       up  1.00000          1.00000
> -3 1.63527     host node2
>  3 0.54509         osd.3       up  1.00000          1.00000
>  4 0.54509         osd.4       up  1.00000          1.00000
>  5 0.54509         osd.5       up  1.00000          1.00000
> -4 1.63527     host node3
>  6 0.54509         osd.6       up  1.00000          1.00000
>  7 0.54509         osd.7       up  1.00000          1.00000
>  8 0.54509         osd.8       up  1.00000          1.00000
> A Linux VM in VMware, running fio: the 4K randwrite result is just 64
> IOPS and latency is high; a dd test gives just 11MB/s.
>
> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=100G \
>     -filename=/dev/sdb -name="EBS 4KB randwrite test" -iodepth=32 -runtime=60
>
> EBS 4KB randwrite test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
> fio-2.0.13
> Starting 1 thread
> Jobs: 1 (f=1): [w] [100.0% done] [0K/131K/0K /s] [0 /32 /0 iops] [eta 00m:00s]
> EBS 4KB randwrite test: (groupid=0, jobs=1): err= 0: pid=6766: Wed Jun 29 21:28:06 2016
>   write: io=15696KB, bw=264627 B/s, iops=64, runt= 60737msec
>     slat (usec): min=10, max=213, avg=35.54, stdev=16.41
>     clat (msec): min=1, max=31368, avg=495.01, stdev=1862.52
>      lat (msec): min=2, max=31368, avg=495.04, stdev=1862.52
>     clat percentiles (msec):
>      |  1.00th=[    7],  5.00th=[    8], 10.00th=[    8], 20.00th=[    9],
>      | 30.00th=[    9], 40.00th=[   10], 50.00th=[  198], 60.00th=[  204],
>      | 70.00th=[  208], 80.00th=[  217], 90.00th=[  799], 95.00th=[ 1795],
>      | 99.00th=[ 7177], 99.50th=[12649], 99.90th=[16712], 99.95th=[16712],
>      | 99.99th=[16712]
>     bw (KB/s)  : min=   36, max=11960, per=100.00%, avg=264.77, stdev=1110.81
>     lat (msec) : 2=0.03%, 4=0.23%, 10=40.93%, 20=0.48%, 50=0.03%
>     lat (msec) : 100=0.08%, 250=39.55%, 500=5.63%, 750=2.91%, 1000=1.35%
>     lat (msec) : 2000=4.03%, >=2000=4.77%
>   cpu          : usr=0.02%, sys=0.22%, ctx=2973, majf=0, minf=18446744073709538907
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.4%, 32=99.2%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=3924/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>   WRITE: io=15696KB, aggrb=258KB/s, minb=258KB/s, maxb=258KB/s, mint=60737msec, maxt=60737msec
>
> Disk stats (read/write):
>   sdb: ios=83/3921, merge=0/0, ticks=60/1903085, in_queue=1931694, util=100.00%
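The exact dd invocation was not given; purely as an assumed reconstruction, a direct-I/O run of this shape would produce a comparable streaming-write figure:

    # Hypothetical reconstruction of the dd test; not the poster's exact command.
    # WARNING: writes directly to /dev/sdb.
    dd if=/dev/zero of=/dev/sdb bs=1M count=1024 oflag=direct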
> Can anyone give me some suggestions to improve the performance?
>
> Regards
>
> MQ
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/