Re: Journals on all SSD cluster

Alexandre DERUMIER <aderumier@xxxxxxxxx> · Thu, 22 Jan 2015 10:47:34 +0100 (CET)

> From my last benchmark,
>>Using which version of Ceph?

It was with Giant (big improvements from thread sharding, so you can use more cores per OSD).

>>That was with replication of 1, if I remember right? 
For reads, I don't see much difference between replication 1, 2 or 3, but my client was CPU-limited.

for write:

replication x1 : 22000 iops
            x2 : 14000 iops
            x3 :  9000 iops

(OSD CPU was clearly the bottleneck for writes)
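
One way to read these numbers, as a rough sketch only: assume each client write turns into one write per replica, so the aggregate work done by the OSDs is the client IOPS times the replication factor. This simple model is an assumption (it ignores journal writes and network overhead), and the function name is only illustrative.

```python
# Client-side 4k random write IOPS measured at each replication factor
# (figures taken from the benchmark above).
measured = {1: 22000, 2: 14000, 3: 9000}


def backend_write_ops(client_iops, replication):
    """Assumed model: every client write becomes one write per replica."""
    return client_iops * replication


for size, iops in sorted(measured.items()):
    total = backend_write_ops(iops, size)
    print(f"x{size}: {iops} client iops -> {total} replica writes/s")
```

Under that assumption the cluster handles roughly 22000 to 28000 replica writes per second in all three cases, which fits the OSD CPU being the limit rather than the SSDs.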

>>Certainly better than spinning rust, but as I said back then, 3 nodes and
>>6 SSDs to barely get the write performance of a single SSD.

>>Any reason for the S3500s, other than the obvious (price)? 
Yes, the price ;)
We are going to use the new 1.6TB S3500 (endurance 880 TBW).

>>As in, expecting a light write load? 
Yes, indeed, we don't have much write load (80% read / 20% write).
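
For anyone who wants to check endurance against their own write load, here is a back-of-the-envelope sketch. Only the 880 TBW figure and the 3 nodes x 6 SSDs layout come from this thread. The sustained client write rate, the replication factor of 3, and the factor of 2 for a journal collocated on the same SSD (each write assumed to land once in the journal and once in the data partition) are assumptions for illustration, and SSD-internal write amplification is ignored.

```python
def years_until_tbw(client_write_mb_s, replication, num_ssds,
                    tbw_per_ssd=880, journal_factor=2):
    """Rough years until each SSD reaches its rated TBW.

    Assumes writes are spread evenly over all SSDs and that a collocated
    journal doubles the bytes written per SSD (journal_factor=2).
    """
    # MB/s actually written to each individual SSD
    per_ssd_mb_s = client_write_mb_s * replication * journal_factor / num_ssds
    # Convert to TB per year (decimal units: 1 TB = 1e6 MB)
    tb_per_year = per_ssd_mb_s * 86400 * 365 / 1e6
    return tbw_per_ssd / tb_per_year


# Hypothetical load: 50 MB/s of sustained client writes, replication 3,
# 18 SSDs (3 nodes x 6 OSDs).
print(round(years_until_tbw(50, 3, 18), 2), "years")
```

Setting journal_factor=1 models moving the journal to a different device, which doubles the estimate for the data SSDs.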

----- Original Message -----
From: "Christian Balzer" <chibi@xxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Cc: "aderumier" <aderumier@xxxxxxxxx>
Sent: Thursday, 22 January 2015 09:37:03
Subject: Re: Journals on all SSD cluster

Hello, 

On Thu, 22 Jan 2015 08:32:13 +0100 (CET) Alexandre DERUMIER wrote: 

> Hi, 
> 
> From my last benchmark, 
Using which version of Ceph? 

> I was at around 120000 iops 4k random read and 20000 iops 4k random write
> (3 nodes with 2 SSDs each, osd+journal, Intel S3500)
> 
That was with replication of 1, if I remember right? 

Certainly better than spinning rust, but as I said back then, 3 nodes and
6 SSDs to barely get the write performance of a single SSD. 

> My main bottleneck was CPU (it was 2x4 cores at 1.4GHz, Intel), both on
> the OSDs and the client.
> 
> 
> Next month I'm going to test my production cluster, with bigger nodes
> (2x10 cores at 3.1GHz): 3 nodes with 6 Intel S3500 1.6TB OSDs per node,
> and the same CPU config for the clients.
> 
Any reason for the S3500s, other than the obvious (price)? 
As in, expecting a light write load? 

> I'll try to post full benchmark results next month (including qemu-kvm 
> optimisations) 
> 
Looking forward to that. 

Christian 

> Regards 
> 
> Alexandre 
> 
> ----- Original Message -----
> From: "Christian Balzer" <chibi@xxxxxxx>
> To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
> Sent: Thursday, 22 January 2015 01:28:58
> Subject: Re: Journals on all SSD cluster
> 
> Hello, 
> 
> On Wed, 21 Jan 2015 23:28:15 +0100 Sebastien Han wrote: 
> 
> > It has been proven that the OSDs can’t take advantage of the SSD, so 
> > I’ll probably collocate both journal and osd data. Search in the ML 
> > for [Single OSD performance on SSD] Can't go over 3, 2K IOPS 
> > 
> > You will see that there is no difference in terms of performance
> > between the following: 
> > 
> > * 1 SSD for journal + 1 SSD for osd data 
> > * 1 SSD for both journal and data 
> > 
> Very, very true. 
> And that would also be the case in any future where the Ceph code gets
> closer to leveraging full SSD performance.
> 
> Now where splitting things _may_ make sense would be if you had 
> different types of SSDs, like fast and durable DC S3700s versus less 
> durable and slower (but really still too fast for Ceph) ones like 
> Samsung 845DC Evo. In that case putting the journal on the Intels would 
> double the lifetime of the Samsungs, while hardly making a dent on the 
> Intels' endurance.
> 
> > What you can do in order to max out your SSD is to run multiple 
> > journals and osd data on the same SSD. Something like this gave me 
> > more IOPS: 
> > 
> > * /dev/sda1 ceph journal 
> > * /dev/sda2 ceph data 
> > * /dev/sda3 ceph journal 
> > * /dev/sda4 ceph data 
> > 
> Yup, the limitations are in the Ceph OSD code right now. 
> 
> However a setup like this will of course kill multiple OSDs if a single 
> SSD fails, not that it matters all that much with normal CRUSH rules. 
> 
> Christian 
> 
> > > On 21 Jan 2015, at 04:32, Andrew Thrift <andrew@xxxxxxxxxxxxxxxxx> 
> > > wrote: 
> > > 
> > > Hi All, 
> > > 
> > > We have a bunch of shiny new hardware we are ready to configure for 
> > > an all SSD cluster. 
> > > 
> > > I am wondering what are other people doing for their journal 
> > > configuration on all SSD clusters ? 
> > > 
> > > - Separate Journal partition and OSD partition on each SSD
> > > 
> > > or 
> > > 
> > > - Journal on OSD 
> > > 
> > > 
> > > Thanks, 
> > > 
> > > 
> > > 
> > > 
> > > Andrew 
> > > _______________________________________________ 
> > > ceph-users mailing list 
> > > ceph-users@xxxxxxxxxxxxxx 
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> > 
> > 
> > Cheers. 
> > –––– 
> > Sébastien Han 
> > Cloud Architect 
> > 
> > "Always give 100%. Unless you're giving blood." 
> > 
> > Phone: +33 (0)1 49 70 99 72 
> > Mail: sebastien.han@xxxxxxxxxxxx 
> > Address : 11 bis, rue Roquépine - 75008 Paris 
> > Web : www.enovance.com - Twitter : @enovance 
> > 
> 
> 

-- 
Christian Balzer Network/Systems Engineer 
chibi@xxxxxxx Global OnLine Japan/Fusion Communications 
http://www.gol.com/ 