Hello!
We use Ceph with OpenStack in our private cloud. We recently upgraded our CentOS 6.5 based cluster from Ceph Emperor to Ceph Firefly.
At first we upgraded from the Red Hat EPEL yum repo, which provides Ceph 0.80.5. We upgraded the monitors first, then the OSDs, and the clients last. After this upgrade we booted a VM on the cluster and used fio to test the I/O performance. It was as good as before. Everything was OK!
Then we upgraded the cluster from 0.80.5 to 0.80.8. When that was done, we rebooted the VM so it would load the newest librbd, and ran the same fio tests again. randwrite and write were as good as before, but randread and read became much worse: randread IOPS dropped from 4000-5000 to 300-400 with higher latency, and read bandwidth dropped from 400MB/s to 115MB/s. When I downgraded the Ceph client from 0.80.8 to 0.80.5, the results returned to normal.
So I think the cause is something in librbd. I compared the 0.80.8 release notes with 0.80.5 (http://ceph.com/docs/master/release-notes/#v0-80-8-firefly ), and the only change I found in 0.80.8 that concerns read requests is: librbd: cap memory utilization for read requests (Jason Dillaman). Can anyone explain this?
My Ceph cluster has 400 OSDs and 5 mons:
ceph -s
health HEALTH_OK
monmap e11: 5 mons at {BJ-M1-Cloud71=172.28.2.71:6789/0,BJ-M1-Cloud73=172.28.2.73:6789/0,BJ-M2-Cloud80=172.28.2.80:6789/0,BJ-M2-Cloud81=172.28.2.81:6789/0,BJ-M3-Cloud85=172.28.2.85:6789/0}, election epoch 198, quorum 0,1,2,3,4 BJ-M1-Cloud71,BJ-M1-Cloud73,BJ-M2-Cloud80,BJ-M2-Cloud81,BJ-M3-Cloud85
osdmap e120157: 400 osds: 400 up, 400 in
pgmap v26161895: 29288 pgs, 6 pools, 20862 GB data, 3014 kobjects
41084 GB used, 323 TB / 363 TB avail
29288 active+clean
client io 52640 kB/s rd, 32419 kB/s wr, 5193 op/s
The following is my Ceph client conf:
[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 172.29.204.24,172.29.204.48,172.29.204.55,172.29.204.58,172.29.204.73
mon_initial_members = ZR-F5-Cloud24, ZR-F6-Cloud48, ZR-F7-Cloud55, ZR-F8-Cloud58, ZR-F9-Cloud73
fsid = c01c8e28-304e-47a4-b876-cb93acc2e980
mon osd full ratio = .85
mon osd nearfull ratio = .75
public network = 172.29.204.0/24
mon warn on legacy crush tunables = false
[osd]
osd op threads = 12
filestore journal writeahead = true
filestore merge threshold = 40
filestore split multiple = 8
[client]
rbd cache = true
rbd cache writethrough until flush = false
rbd cache size = 67108864
rbd cache max dirty = 50331648
rbd cache target dirty = 33554432
[client.cinder]
admin socket = /var/run/ceph/rbd-$pid.asok
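For readability, the rbd cache values in the [client] section are byte counts. A small sketch that converts them to MiB (the helper name to_mib is only for illustration, it is not a Ceph tool):

```shell
#!/bin/sh
# Convert a byte count to MiB using integer arithmetic.
# to_mib is an illustrative helper, not part of Ceph.
to_mib() {
    echo $(( $1 / 1048576 ))
}

echo "rbd cache size         = $(to_mib 67108864) MiB"
echo "rbd cache max dirty    = $(to_mib 50331648) MiB"
echo "rbd cache target dirty = $(to_mib 33554432) MiB"
```

That is a 64 MiB cache with a 48 MiB dirty limit and a 32 MiB dirty target.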
My VM has 8 cores and 16G of RAM. The fio commands we use are:
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=60G -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=60G -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=read -size=60G -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=write -size=60G -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
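The four commands above differ only in the -rw value, so they can be sketched as one loop. This is a dry run that only prints the commands; the function name fio_cmd and the DEV variable are illustrative, and the options simply mirror the ones listed above:

```shell
#!/bin/sh
# Target device, as in the commands above.
DEV=/dev/vdb

# Print the fio command line for one access pattern ($1).
# fio_cmd is an illustrative helper; the options are copied from the original commands.
fio_cmd() {
    echo "fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=$1 -size=60G -filename=$DEV -name=EBS -iodepth=32 -runtime=200"
}

# Dry run: print the four commands in the order they were run.
for rw in randread randwrite read write; do
    fio_cmd "$rw"
done
```

To actually run the tests, the output could be piped to sh. Note that the write and randwrite tests overwrite the contents of the device.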
The following are the I/O test results:
ceph client version: 0.80.5
read: bw=430MB
write: bw=420MB
randread: iops=4875 latency=65ms
randwrite: iops=6844 latency=46ms
ceph client version: 0.80.8
read: bw=115MB
write: bw=480MB
randread: iops=381 latency=83ms
randwrite: iops=4843 latency=68ms
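To make the size of the change easier to see, here is a minimal sketch that computes the percentage change between the two client versions from the figures listed above. The helper name pct_change is only for illustration:

```shell
#!/bin/sh
# Percentage change from an old value ($1) to a current value ($2), one decimal place.
# pct_change is an illustrative helper; the inputs are the fio figures listed above.
pct_change() {
    awk -v old="$1" -v cur="$2" 'BEGIN { printf "%.1f\n", (cur - old) / old * 100 }'
}

echo "read bw:           $(pct_change 430 115)%"
echo "write bw:          $(pct_change 420 480)%"
echo "randread iops:     $(pct_change 4875 381)%"
echo "randwrite iops:    $(pct_change 6844 4843)%"
echo "randread latency:  $(pct_change 65 83)%"
echo "randwrite latency: $(pct_change 46 68)%"
```

With these inputs, read bandwidth comes out at about -73% and randread IOPS at about -92% going from 0.80.5 to 0.80.8.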
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com