Ceph Performance

Hi there,

I am new to Ceph and still learning its performance capabilities, but I would like to share my performance results in the hope that they are useful to others, and also to see if there is room for improvement in my setup.

Firstly, a little about my setup:

3 servers (quad-core CPU, 16GB RAM), each with 4 SATA 7.2K RPM disks (4TB) plus a 160GB SSD.

I have mapped a 10GB RBD volume to a 4th server, which is acting as a Ceph client. Because RBD volumes are thin-provisioned, I used "dd" to write across the entire block device so that the volume is fully allocated before benchmarking. dd writes sequentially at around 95MB/s, which shows the network link can run at full capacity.
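
For reference, the volume was created, mapped and pre-filled with something like the following (the pool/image names and the /dev/rbd1 device are just placeholders, not necessarily my exact names):

# create a 10GB image, map it on the client, then fill it so it is fully allocated
rbd create --size 10240 test-pool/fio-test
rbd map test-pool/fio-test                 # shows up on the client as e.g. /dev/rbd1
dd if=/dev/zero of=/dev/rbd1 bs=4M         # sequential fill, ~95MB/s over the 1Gbps link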

Each machine is connected to the switch by a single 1Gbps Ethernet link.

I then used "fio" to benchmark the raw block device, since I also need to compare Ceph against a traditional iSCSI SAN and the internal "rados bench" tool cannot be used for that comparison.

The replication level for the pool I am testing against is 2.
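
The pool size was checked/set along these lines ("test-pool" is just a placeholder name):

ceph osd pool get test-pool size      # should report: size: 2
ceph osd pool set test-pool size 2    # only needed if it isn't already 2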

I have tried two setups with regard to the OSDs: firstly with the journal running on a partition on the SSD, and secondly using "bcache" (http://bcache.evilpiepirate.org) to provide a write-back cache in front of the 4TB drives.
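
Roughly, the two setups look like the following; the device names and paths here are illustrative only, not my exact configuration:

# Setup 1: journal on an SSD partition, via ceph.conf
[osd.0]
    osd journal = /dev/sda5          # example: partition on the Intel SSD

# Setup 2: bcache write-back cache in front of each spinner,
# journal left as the default file on the OSD's data disk
make-bcache -C /dev/sda6 -B /dev/sdb                   # SSD partition as cache, 4TB disk as backing
echo writeback > /sys/block/bcache0/bcache/cache_mode  # switch from the default write-through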

In all tests, fio was configured to do direct I/O with 256 parallel I/Os (iodepth=256).

With the journal on the SSD:

4k random read: around 1200 IOPS, ~5MB/s.
4k random write: around 300 IOPS, ~1.2MB/s.

Using bcache for each OSD (journal is just a file on the OSD):
4k random read: around 2200 IOPS, ~9MB/s.
4k random write: around 300 IOPS, ~1.2MB/s.

By comparison, a 12-disk RAID5 iSCSI SAN is doing ~4000 read IOPS and ~2000 write IOPS (albeit with 15K RPM SAS disks).

What is interesting is that bcache definitely has a positive effect on the read IOPS, but something else is the bottleneck for writes.

It looks to me like I have missed something in the configuration which brings down the write IOPS, since 300 IOPS is very poor. If, however, I turn off direct I/O in the fio tests, write performance jumps to around 4000 IOPS. It makes no difference to the read performance, which is to be expected.
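
For the buffered runs, the only change to the fio job files shown at the end was the direct flag:

direct=1   # direct I/O:   ~300 write IOPS
direct=0   # buffered I/O: ~4000 write IOPS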

I have tried increasing the number of threads in each OSD but that has made no difference.
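
The thread settings I experimented with were along these lines (the values are just examples of what I tried, not a recommendation):

[osd]
    osd op threads = 8         # default is 2
    filestore op threads = 4   # default is 2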

I have also tried images with different (smaller) object sizes (--order) instead of the default 4MB, but it doesn't make any difference.
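
For example, recreating the image with 1MB objects instead of the default 4MB (order 22); names are placeholders:

rbd create --size 10240 --order 20 test-pool/fio-test-1m   # 2^20 bytes = 1MB objects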

Do these figures look reasonable to others? What kind of IOPS should I be expecting?

Additional info is below:

Ceph 0.72.2 running on CentOS 6.5 (with a custom 3.10.25 kernel for bcache support)
3 servers of the following spec:
CPU: Quad Core Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
RAM: 16GB
Disks: 4x 4TB Seagate Constellation (7.2K RPM) plus 1x Intel 160GB DC S3500 SSD

Test pool has 400 placement groups (and 400 placement groups for placement, i.e. pg_num = pgp_num = 400).
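
i.e. the pool was created with something like (pool name is a placeholder):

ceph osd pool create test-pool 400 400    # pg_num and pgp_num both 400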

fio configuration - read:
[global]
rw=randread
filename=/dev/rbd1
ioengine=posixaio
iodepth=256
direct=1
runtime=60
ramp_time=30
blocksize=4k
write_bw_log=fio-2-random-read
write_lat_log=fio-2-random-read
write_iops_log=fio-2-random-read

fio configuration - writes:
[global]
rw=randwrite
filename=/dev/rbd1
ioengine=posixaio
iodepth=256
direct=1
runtime=60
ramp_time=30
blocksize=4k
write_bw_log=fio-2-random-write
write_lat_log=fio-2-random-write
write_iops_log=fio-2-random-write
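
Each job file was run with a plain fio invocation on the client, e.g. (file names are just examples):

fio random-read.fio
fio random-write.fio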


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
