Can you give some more insight about the Ceph cluster you are running? It seems IO started and then there was no response; cur MB/s drops to 0 after the first second. What is the output of 'ceph -s'? Hope all the OSDs are up and running.

Thanks & Regards,
Somnath
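A quick first-pass check with the standard CLI would be:

# ceph -s                # overall status: monitor quorum, OSD counts, PG states
# ceph osd tree          # confirm every OSD is up and in
# ceph health detail     # expand any HEALTH_WARN/HEALTH_ERR items

Down OSDs or stuck/degraded PGs there could explain reads stalling like this.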
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of changqian zuo

Hi, guys,

We have been running an OpenStack Havana environment with Ceph 0.72.2 as the block storage backend. Recently we have been trying to upgrade OpenStack to Juno. For testing, we deployed a Juno all-in-one node.
This node shares the same Cinder volume rbd pool and Glance image rbd pool with the old Havana environment. After some testing, we found a serious read performance problem in the Juno client (write is just OK), something like:

# rados bench -p test 30 seq
 sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   0       0         0         0         0         0         -         0
   1      16       100        84   335.843       336  0.020221 0.0393582
   2      16       100        84   167.944         0         - 0.0393582
   3      16       100        84   111.967         0         - 0.0393582
   4      16       100        84   83.9769         0         - 0.0393582
   5      16       100        84   67.1826         0         - 0.0393582
   6      16       100        84   55.9863         0         - 0.0393582
   7      16       100        84   47.9886         0         - 0.0393582
   8      16       100        84   41.9905         0         - 0.0393582
   9      16       100        84   37.3249         0         - 0.0393582
  10      16       100        84   33.5926         0         - 0.0393582
  11      16       100        84   30.5388         0         - 0.0393582
  12      16       100        84   27.9938         0         - 0.0393582
  13      16       100        84   25.8405         0         - 0.0393582
  14      16       100        84   23.9948         0         - 0.0393582
  15      16       100        84   22.3952         0         - 0.0393582
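(Note that 'rados bench seq' only reads back objects left behind by an earlier 'rados bench write' run, so the usual sequence is roughly:

# rados bench -p test 60 write --no-cleanup    # write benchmark objects and keep them
# rados bench -p test 30 seq                   # sequentially read those objects back
# rados -p test cleanup                        # remove the benchmark objects (recent releases)

Note how started/finished freeze at 100/84 after the first second while cur MB/s drops to 0.)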
And when testing an RBD image with fio (bs=512k read), there are entries like:

# grep 12067 ceph.client.log | grep read
2015-05-11 16:19:36.649554 7ff9949d5a00  1 -- 10.10.11.15:0/2012449 --> 10.10.11.21:6835/45746 -- osd_op(client.3772684.0:12067 rbd_data.262a6e7bf17801.0000000000000003 [sparse-read 2621440~524288] 7.c43a3ae3 e240302) v4 -- ?+0 0x7ff9967c5fb0 con 0x7ff99a41c420
2015-05-11 16:20:07.709915 7ff94bfff700  1 -- 10.10.11.15:0/2012449 <== osd.218 10.10.11.21:6835/45746 111 ==== osd_op_reply(12067 rbd_data.262a6e7bf17801.0000000000000003 [sparse-read 2621440~524288] v0'0 uv3803266 ondisk = 0) v6 ==== 199+0+524312 (3484234903 0 0) 0x7ff3a4002ba0 con 0x7ff99a41c420
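(For anyone reproducing this: a fio job for this kind of test might look roughly like the following, using fio's rbd engine; the pool, image, and client names are illustrative, not our exact configuration:

[global]
ioengine=rbd        ; fio's librbd engine, talks to the cluster directly
clientname=admin    ; cephx user (fio prepends "client.")
pool=volumes        ; rbd pool holding the test image
rbdname=test-img    ; existing rbd image to read from
rw=read             ; sequential reads
bs=512k

[seq-read]
iodepth=16
runtime=60
time_based=1

)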
Some operations take more than a minute.
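(One way to see where a slow op like 12067 spends its time is the admin socket on the OSD host, here osd.218 from the reply above; these are the standard commands, though the output format varies by release:

# ceph daemon osd.218 dump_historic_ops     # recent slowest ops, with per-step timestamps
# ceph daemon osd.218 dump_ops_in_flight    # ops currently in progress
)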
I checked the OSD logs (at the default logging level; ceph.com says that when a request takes too long, the OSD will complain about it in its log) and do see some slow 4k write requests, but no slow reads. We have tested Giant, Firefly, and a self-built Emperor client, all with the same sad results.

The network between the OSDs and the all-in-one node is a 10Gb network. This is from client to OSD:

# iperf3 -c 10.10.11.25 -t 60 -i 1
Connecting to host 10.10.11.25, port 5201
[  4] local 10.10.11.15 port 41202 connected to 10.10.11.25 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.09 GBytes  9.32 Gbits/sec   11   2.02 MBytes
[  4]   1.00-2.00   sec  1.09 GBytes  9.35 Gbits/sec   34   1.53 MBytes
[  4]   2.00-3.00   sec  1.09 GBytes  9.35 Gbits/sec   11   1.14 MBytes
[  4]   3.00-4.00   sec  1.09 GBytes  9.37 Gbits/sec    0   1.22 MBytes
[  4]   4.00-5.00   sec  1.09 GBytes  9.34 Gbits/sec    0   1.27 MBytes

and this from OSD to client (there may be some problem with the client's interface bonding; 10Gb could not be reached):

# iperf3 -c 10.10.11.15 -t 60 -i 1
Connecting to host 10.10.11.15, port 5201
[  4] local 10.10.11.25 port 43934 connected to 10.10.11.15 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   400 MBytes  3.35 Gbits/sec    1    337 KBytes
[  4]   1.00-2.00   sec   553 MBytes  4.63 Gbits/sec    1    341 KBytes
[  4]   2.00-3.00   sec   390 MBytes  3.27 Gbits/sec    1    342 KBytes
[  4]   3.00-4.00   sec   395 MBytes  3.32 Gbits/sec    0    342 KBytes
[  4]   4.00-5.00   sec   541 MBytes  4.54 Gbits/sec    0    346 KBytes
[  4]   5.00-6.00   sec   405 MBytes  3.40 Gbits/sec    0    358 KBytes
[  4]   6.00-7.00   sec   728 MBytes  6.11 Gbits/sec    1    370 KBytes
[  4]   7.00-8.00   sec   741 MBytes  6.22 Gbits/sec    0    355 KBytes
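(To check the bonding suspicion on the client, assuming a Linux bonding interface named bond0; the interface names are illustrative:

# cat /proc/net/bonding/bond0      # bonding mode, slave link states, failure counts
# ethtool bond0 | grep -i speed    # reported aggregate speed
# ethtool eth0 | grep -i speed     # per-slave speed

Depending on the mode, a single TCP stream can be pinned to one slave, which would fit the 3-6 Gbit/s seen above.)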
The Ceph cluster is shared by this Juno node and the old Havana environment (as mentioned, they use exactly the same rbd pools), and IO on Havana is just fine. Any suggestion or advice, so that we can determine whether this is an issue with the client, the network, or the Ceph cluster, and then go on? I am new to Ceph and need some help.

Thanks
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com