Re: Journals on all SSD cluster

Alexandre DERUMIER <aderumier@xxxxxxxxx> · Thu, 22 Jan 2015 08:32:13 +0100 (CET)

Hi,

From my last benchmark,
I was getting around 120,000 IOPS for 4k random reads and 20,000 IOPS for 4k random writes (3 nodes, each with 2 SSD OSDs + journal SSD, Intel S3500).

My main bottleneck was CPU (2x 4 cores at 1.4 GHz, Intel), on both the OSD nodes and the client.

Next month I'm going to test my production cluster, which has bigger nodes (2x 10 cores at 3.1 GHz):
3 nodes with 6 Intel S3500 1.6 TB OSDs per node.
The clients will have the same CPU configuration.

I'll try to post full benchmark results next month (including qemu-kvm optimisations) 

Regards

Alexandre

----- Original Message -----
From: "Christian Balzer" <chibi@xxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Thursday, 22 January 2015 01:28:58
Subject: Re: Journals on all SSD cluster

Hello, 

On Wed, 21 Jan 2015 23:28:15 +0100 Sebastien Han wrote: 

> It has been proven that the OSDs can’t take advantage of the SSD, so 
> I’ll probably collocate both journal and osd data. Search in the ML for 
> [Single OSD performance on SSD] Can't go over 3, 2K IOPS 
> 
> You will see that there is no difference in terms of performance between 
> the following: 
> 
> * 1 SSD for journal + 1 SSD for osd data 
> * 1 SSD for both journal and data 
> 
Very, very true. 
And that would also be the case in any future where the Ceph code gets 
closer to leveraging full SSD performance. 

Now where splitting things _may_ make sense would be if you had different 
types of SSDs, like fast and durable DC S3700s versus less durable and 
slower (but really still too fast for Ceph) ones like Samsung 845DC Evo. 
In that case putting the journal on the Intels would double the lifetime 
of the Samsungs, while hardly making a dent in the Intels' endurance. 
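
To make the "double the lifetime" arithmetic explicit, here is a rough sketch in Python. It assumes every client write lands once in the journal and once in the data store, and it ignores filesystem metadata and flash write amplification. The endurance rating and daily write volume are hypothetical numbers picked for illustration, not figures for any real drive.

```python
def device_writes_per_client_byte(journal_on_same_ssd: bool) -> float:
    """Bytes written to the data SSD for each byte a client writes to the OSD.

    Assumption: the journal and the data store each receive one full copy of
    every write; metadata overhead and flash write amplification are ignored.
    """
    data_copy = 1.0
    journal_copy = 1.0 if journal_on_same_ssd else 0.0
    return data_copy + journal_copy


def lifetime_days(endurance_tb: float, client_tb_per_day: float,
                  journal_on_same_ssd: bool) -> float:
    """Days until the rated write endurance of the data SSD is used up."""
    per_day = client_tb_per_day * device_writes_per_client_byte(journal_on_same_ssd)
    return endurance_tb / per_day


# Hypothetical values, for illustration only.
ENDURANCE_TB = 600.0       # rated total writes of the data SSD
CLIENT_TB_PER_DAY = 0.5    # client writes reaching this OSD per day

colocated = lifetime_days(ENDURANCE_TB, CLIENT_TB_PER_DAY, journal_on_same_ssd=True)
split = lifetime_days(ENDURANCE_TB, CLIENT_TB_PER_DAY, journal_on_same_ssd=False)
print(f"co-located: {colocated:.0f} days, split: {split:.0f} days, "
      f"ratio: {split / colocated:.1f}x")
```

Under these assumptions the ratio is always 2x regardless of the numbers plugged in, since moving the journal away removes one of the two copies written to the data SSD.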

> What you can do in order to max out your SSD is to run multiple journals 
> and osd data on the same SSD. Something like this gave me more IOPS: 
> 
> * /dev/sda1 ceph journal 
> * /dev/sda2 ceph data 
> * /dev/sda3 ceph journal 
> * /dev/sda4 ceph data 
> 
Yup, the limitations are in the Ceph OSD code right now. 

However a setup like this will of course kill multiple OSDs if a single 
SSD fails, not that it matters all that much with normal CRUSH rules. 
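
A toy illustration of why it matters little, assuming "normal CRUSH rules" means one replica per host with three replicas (my assumption, and this is not the real CRUSH algorithm, just an enumeration over a made-up layout with hypothetical host and OSD names):

```python
from itertools import product

# Hypothetical layout: 3 hosts, one SSD per host, two OSDs on each SSD.
HOSTS = {
    "node1": ["osd.0", "osd.1"],
    "node2": ["osd.2", "osd.3"],
    "node3": ["osd.4", "osd.5"],
}


def placements(hosts):
    """Every way to pick exactly one OSD per host (replica count == host count)."""
    return list(product(*hosts.values()))


def max_replicas_lost(hosts, failed_host):
    """Worst-case number of replicas a placement loses when the SSD in
    failed_host dies and takes both of its OSDs with it."""
    failed = set(hosts[failed_host])
    return max(sum(osd in failed for osd in p) for p in placements(hosts))


worst = max(max_replicas_lost(HOSTS, host) for host in HOSTS)
print(f"worst-case replicas lost per placement: {worst}")
```

Since both OSDs on the dead SSD sit in the same host, no placement ever loses more than one replica; the cost is the extra recovery traffic, not data loss.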

Christian 

> > On 21 Jan 2015, at 04:32, Andrew Thrift <andrew@xxxxxxxxxxxxxxxxx> 
> > wrote: 
> > 
> > Hi All, 
> > 
> > We have a bunch of shiny new hardware we are ready to configure for an 
> > all SSD cluster. 
> > 
> > I am wondering what are other people doing for their journal 
> > configuration on all SSD clusters ? 
> > 
> > - Separate Journal partition and OSD partition on each SSD 
> > 
> > or 
> > 
> > - Journal on OSD 
> > 
> > 
> > Thanks, 
> > 
> > 
> > 
> > 
> > Andrew 
> > _______________________________________________ 
> > ceph-users mailing list 
> > ceph-users@xxxxxxxxxxxxxx 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> 
> 
> Cheers. 
> –––– 
> Sébastien Han 
> Cloud Architect 
> 
> "Always give 100%. Unless you're giving blood." 
> 
> Phone: +33 (0)1 49 70 99 72 
> Mail: sebastien.han@xxxxxxxxxxxx 
> Address : 11 bis, rue Roquépine - 75008 Paris 
> Web : www.enovance.com - Twitter : @enovance 
> 

-- 
Christian Balzer Network/Systems Engineer 
chibi@xxxxxxx Global OnLine Japan/Fusion Communications 
http://www.gol.com/ 
_______________________________________________ 
ceph-users mailing list 
ceph-users@xxxxxxxxxxxxxx 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
