Hi,

1. There are 2 software iSCSI gateways (deployed on the OSD/monitor nodes), created with lrbd; the iSCSI target is LIO. Configuration:

    {
        "auth": [
            {
                "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
                "authentication": "none"
            }
        ],
        "targets": [
            {
                "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
                "hosts": [
                    { "host": "node2", "portal": "east" },
                    { "host": "node3", "portal": "west" }
                ]
            }
        ],
        "portals": [
            { "name": "east", "addresses": [ "10.0.52.92" ] },
            { "name": "west", "addresses": [ "10.0.52.93" ] }
        ],
        "pools": [
            {
                "pool": "rbd",
                "gateways": [
                    {
                        "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
                        "tpg": [
                            { "image": "testvol" }
                        ]
                    }
                ]
            }
        ]
    }
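For reference, the LIO state that lrbd programs from this JSON can be inspected on each gateway with the stock targetcli shell; a minimal check, assuming targetcli is installed on node2/node3:

    # List the resulting target, TPGs, portals and backstores on a gateway.
    sudo targetcli ls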
2. The Ceph cluster itself performs well. I created an RBD image on one of the Ceph nodes and the fio results are good: 4K randwrite IOPS=3013, bw=100MB/s. So I don't think the Ceph cluster itself is the bottleneck.
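To separate the cluster from the iSCSI path, a run like the following drives the image over librbd directly; this is only a sketch, assuming fio was built with rbd engine support and a usable client.admin keyring:

    # Benchmark testvol via librbd, bypassing the iSCSI gateways entirely.
    fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testvol \
        --rw=randwrite --bs=4k --iodepth=32 --direct=1 \
        --runtime=60 --time_based --name=rbd-baseline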
3. The SSD is an Intel S3510 480GB, an enterprise model, not a consumer one.
New test: cloning a VM in VMware can reach 100MB/s, but fio and dd tests inside the VM are still poor.
Hello,

On Fri, 1 Jul 2016 13:04:45 +0800 mq wrote:

> Hi list
>
> I have tested SUSE Enterprise Storage 3 using 2 iSCSI gateways attached
> to VMware. The performance is bad.
First off, it's somewhat funny that you're testing the repackaged SUSE
Ceph, but asking for help here (with Ceph being owned by Red Hat).

Aside from that, you're not telling us what these 2 iSCSI gateways are
(SW, HW specs/configuration).

Having iSCSI on top of Ceph is by the very nature of things going to be
slower than native Ceph.

Use "rbd bench" or a VM client with RBD to get a base number of what your
Ceph cluster is capable of; this will help identify where the slowdown
is. (A sketch of such a run follows the quoted specs below.)

> I have turned off VAAI following
> https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1033665
>
> My cluster:
> 3 Ceph nodes: 2*E5-2620, 64G mem, 2*1Gbps, (3*10K SAS, 1*480G SSD) per
> node, SSD as journal
> 1 VMware node: 2*E5-2620, 64G mem, 2*1Gbps
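One possible shape of that baseline, using the Jewel-era "rbd bench-write" command (the image name is assumed from the config above; check "rbd help bench-write" for the exact options on your build):

    # Random 4K writes, 16 threads, 1 GiB total, against rbd/testvol.
    rbd bench-write rbd/testvol --io-size 4096 --io-threads 16 \
        --io-total 1073741824 --io-pattern rand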
That's a slow (latency wise) network, but not your problem.

What SSD model? A 480GB size suggests a consumer model and that would
explain a lot.

Check your storage nodes with atop during the fio runs and see if you can
spot a bottleneck; a minimal sketch of that follows the quoted "ceph -s"
below.

Christian

> # ceph -s
>     cluster 0199f68d-a745-4da3-9670-15f2981e7a15
>      health HEALTH_OK
>      monmap e1: 3 mons at {node1=192.168.50.91:6789/0,node2=192.168.50.92:6789/0,node3=192.168.50.93:6789/0}
>             election epoch 22, quorum 0,1,2 node1,node2,node3
>      osdmap e200: 9 osds: 9 up, 9 in
>             flags sortbitwise
>       pgmap v1162: 448 pgs, 1 pools, 14337 MB data, 4935 objects
>             18339 MB used, 5005 GB / 5023 GB avail
>                  448 active+clean
>   client io 87438 kB/s wr, 0 op/s rd, 213 op/s wr
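A minimal way to do that watching, assuming atop (and optionally sysstat, for iostat) is installed on the OSD nodes:

    # Run on each OSD node while the fio test is active; 2-second samples.
    # Watch the DSK lines for the journal SSD and the SAS disks, and the
    # NET lines for the 1Gbps interfaces.
    atop 2

    # Alternative view: per-device utilization and await times.
    iostat -x 2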
> sudo ceph osd tree
> ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 4.90581 root default
> -2 1.63527     host node1
>  0 0.54509         osd.0       up  1.00000          1.00000
>  1 0.54509         osd.1       up  1.00000          1.00000
>  2 0.54509         osd.2       up  1.00000          1.00000
> -3 1.63527     host node2
>  3 0.54509         osd.3       up  1.00000          1.00000
>  4 0.54509         osd.4       up  1.00000          1.00000
>  5 0.54509         osd.5       up  1.00000          1.00000
> -4 1.63527     host node3
>  6 0.54509         osd.6       up  1.00000          1.00000
>  7 0.54509         osd.7       up  1.00000          1.00000
>  8 0.54509         osd.8       up  1.00000          1.00000
> A Linux VM in VMware, running fio: the 4K randwrite result is just 64
> IOPS and latency is high; a dd test gives just 11MB/s.
>
> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=100G \
>     -filename=/dev/sdb -name="EBS 4KB randwrite test" -iodepth=32 -runtime=60
>
> EBS 4KB randwrite test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
> fio-2.0.13
> Starting 1 thread
> Jobs: 1 (f=1): [w] [100.0% done] [0K/131K/0K /s] [0 /32 /0 iops] [eta 00m:00s]
> EBS 4KB randwrite test: (groupid=0, jobs=1): err= 0: pid=6766: Wed Jun 29 21:28:06 2016
>   write: io=15696KB, bw=264627 B/s, iops=64, runt= 60737msec
>     slat (usec): min=10, max=213, avg=35.54, stdev=16.41
>     clat (msec): min=1, max=31368, avg=495.01, stdev=1862.52
>      lat (msec): min=2, max=31368, avg=495.04, stdev=1862.52
>     clat percentiles (msec):
>      |  1.00th=[    7],  5.00th=[    8], 10.00th=[    8], 20.00th=[    9],
>      | 30.00th=[    9], 40.00th=[   10], 50.00th=[  198], 60.00th=[  204],
>      | 70.00th=[  208], 80.00th=[  217], 90.00th=[  799], 95.00th=[ 1795],
>      | 99.00th=[ 7177], 99.50th=[12649], 99.90th=[16712], 99.95th=[16712],
>      | 99.99th=[16712]
>     bw (KB/s)  : min=   36, max=11960, per=100.00%, avg=264.77, stdev=1110.81
>     lat (msec) : 2=0.03%, 4=0.23%, 10=40.93%, 20=0.48%, 50=0.03%
>     lat (msec) : 100=0.08%, 250=39.55%, 500=5.63%, 750=2.91%, 1000=1.35%
>     lat (msec) : 2000=4.03%, >=2000=4.77%
>   cpu          : usr=0.02%, sys=0.22%, ctx=2973, majf=0, minf=18446744073709538907
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.4%, 32=99.2%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=3924/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>   WRITE: io=15696KB, aggrb=258KB/s, minb=258KB/s, maxb=258KB/s, mint=60737msec, maxt=60737msec
>
> Disk stats (read/write):
>   sdb: ios=83/3921, merge=0/0, ticks=60/1903085, in_queue=1931694, util=100.00%
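The exact dd invocation was not given; purely as an assumed reconstruction, a direct-I/O run of this shape would produce a comparable streaming-write figure:

    # Hypothetical reconstruction of the dd test; not the poster's exact command.
    # WARNING: writes directly to /dev/sdb.
    dd if=/dev/zero of=/dev/sdb bs=1M count=1024 oflag=direct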
> Can anyone give me some suggestions to improve the performance?
>
> Regards
>
> MQ
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/