Re: Poor IOPS performance with Ceph

Daleep Bais <daleepbais@xxxxxxxxx> · Wed, 9 Sep 2015 14:07:08 +0530

Hi Nick,

I don't have a separate SSD or HDD for the journal. I am using a 10 GB partition on the same HDD for journaling. They are rotating HDDs, not SSDs.

I am using the command below to run the test:

fio --name=test --filename=test --bs=4k  --size=4G --readwrite=read / write

I did some kernel tuning, and that has improved my write IOPS. For reads I am using rbd_readahead and have also set the read_ahead_kb kernel tuning parameter.
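
For reference, the kernel read-ahead setting mentioned here is exposed per block device in sysfs as queue/read_ahead_kb. The following is only a minimal sketch for checking the current value; the helper name show_readahead and the device name rbd0 are assumptions for illustration and do not come from the original post.

```shell
# Print the current read-ahead (in KiB) for a block device.
#   $1: device name (e.g. rbd0; an assumed name, adjust to your mapping)
#   $2: sysfs root, defaults to /sys (overridable so the helper can be tested)
show_readahead() {
    dev="$1"
    root="${2:-/sys}"
    f="$root/block/$dev/queue/read_ahead_kb"
    if [ -r "$f" ]; then
        printf '%s read_ahead_kb=%s\n' "$dev" "$(cat "$f")"
    else
        printf '%s: no readable %s\n' "$dev" "$f" >&2
        return 1
    fi
}

# Example call (assumes a mapped device named rbd0):
#   show_readahead rbd0
```

Writing a new value to the same file as root changes the setting until the next reboot or remap.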

Also, I should mention that it's not x86; it's on 32-bit ARMv7.

Thanks.

Daleep Singh Bais

On Wed, Sep 9, 2015 at 1:55 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Daleep Bais
> Sent: 09 September 2015 09:18
> To: Ceph-User <ceph-users@xxxxxxxx>
> Subject: Poor IOPS performance with Ceph
>
> Hi,
>
> I have made a test Ceph cluster of 6 OSDs and 3 MONs. I am testing the
> read/write performance of the test cluster, and the read IOPS is poor.
> When I test each HDD individually, I get good performance, whereas
> when I test the Ceph cluster, it is poor.

Can you give any further details about your cluster? Are your HDDs backed by SSD journals?

>
> Between nodes, using iperf, I get good bandwidth.
>
> My cluster info:
>
> root@ceph-node3:~# ceph --version
> ceph version 9.0.2-752-g64d37b7 (64d37b70a687eb63edf69a91196bb124651da210)
> root@ceph-node3:~# ceph -s
>     cluster 9654468b-5c78-44b9-9711-4a7c4455c480
>      health HEALTH_OK
>      monmap e9: 3 mons at {ceph-node10=192.168.1.210:6789/0,ceph-node17=192.168.1.217:6789/0,ceph-node3=192.168.1.203:6789/0}
>             election epoch 442, quorum 0,1,2 ceph-node3,ceph-node10,ceph-node17
>      osdmap e1850: 6 osds: 6 up, 6 in
>       pgmap v17400: 256 pgs, 2 pools, 9274 MB data, 2330 objects
>             9624 MB used, 5384 GB / 5394 GB avail
>                  256 active+clean
>
> I have mapped an RBD block device to a client machine (Ubuntu 14), and from
> there, when I run tests using fio, I get good write IOPS; however, read is
> comparatively poor.
>
> Write IOPS: approx. 44618
>
> Read IOPS: approx. 7356

The first thing that strikes me is that your numbers are too good, unless these are actually SSDs and not spinning HDDs. I would expect a maximum of around 600 read IOPS for 6x 7.2k disks, so I guess you are hitting either the page cache on the OSD node(s) or the librbd cache.
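
To make the gap concrete, here is a rough back-of-the-envelope check using only the figures quoted in this thread. The estimate of 600 IOPS across 6 disks implies roughly 100 IOPS per disk; the variable names are made up for illustration, and the shell's integer division rounds the ratios down.

```shell
# Figures taken from this thread.
disks=6
expected_total_iops=600   # rough ceiling estimated for 6x 7.2k disks
read_iops=7356            # read IOPS reported from fio
write_iops=44618          # write IOPS reported from fio

# Implied per-disk budget (integer division).
per_disk=$((expected_total_iops / disks))

# How many times larger the measured values are than the estimate (rounded down).
read_ratio=$((read_iops / expected_total_iops))
write_ratio=$((write_iops / expected_total_iops))

echo "per-disk estimate: ${per_disk} IOPS"
echo "read is about ${read_ratio}x the estimate, write is about ${write_ratio}x"
```

Results this far above what the spindles can deliver suggest the I/O is being served from memory rather than from the disks.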

The writes are even higher. Are you using the "direct=1" option in the fio job?
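
As a sketch of what such a job could look like, the snippet below writes a fio job file that reuses the parameters from the command earlier in this thread (filename=test, bs=4k, size=4G) and adds direct=1, which asks fio to use non-buffered (O_DIRECT) I/O so that the client's page cache is bypassed. The job file name direct-test.fio and the job names seq-read and seq-write are assumptions for illustration.

```shell
# Write a fio job file; run it afterwards with:  fio direct-test.fio
cat > direct-test.fio <<'EOF'
; Settings shared by both jobs. direct=1 requests O_DIRECT I/O.
[global]
filename=test
bs=4k
size=4G
direct=1

; Sequential read pass, as in the original command.
[seq-read]
rw=read
stonewall

; Sequential write pass; stonewall makes it wait for the read job to finish.
[seq-write]
rw=write
stonewall
EOF

echo "wrote direct-test.fio"
```

Comparing these results with the buffered run should show how much of the earlier figure came from caching on the client.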

>
> Pool replica - single
> pool 1 'test1' replicated size 1 min_size 1
>
> I have also implemented rbd_readahead in my Ceph conf file.
> Any suggestions in this regard will help me.
>
> Thanks.
>
> Daleep Singh Bais
