full osd ssd cluster advise : replication 2x or 3x ?

Hello,

On Thu, 22 May 2014 18:00:56 +0200 (CEST) Alexandre DERUMIER wrote:

> Hi,
> 
> I'm looking to build a full osd ssd cluster, with this config:
> 
What is your main goal for that cluster: high IOPS, high sequential
writes, or reads?

Remember my "Slow IOPS on RBD..." thread: you probably shouldn't expect
more than 800 write IOPS and 4000 read IOPS per OSD (replication 2).
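
As a back-of-the-envelope check (a sketch only; the per-OSD figures are
from my tests and your hardware and workload may differ), the
cluster-wide ceiling would look roughly like this:

    # Back-of-envelope IOPS ceiling for the proposed 6-node, 10-OSD-per-node
    # cluster, using the per-OSD figures above (measured at replication 2).
    # Purely illustrative; real numbers depend on CPU, network and workload.
    nodes = 6
    osds_per_node = 10
    osds = nodes * osds_per_node              # 60 OSDs

    write_iops_per_osd = 800                  # client-visible, at replication 2
    read_iops_per_osd = 4000                  # reads are served by the primary only

    for replication in (2, 3):
        # going from 2x to 3x multiplies back-end writes by 3/2, so the
        # client-visible write ceiling drops by the same factor
        writes = osds * write_iops_per_osd * 2 / replication
        reads = osds * read_iops_per_osd
        print(f"replication {replication}: ~{writes:,.0f} write IOPS, "
              f"~{reads:,.0f} read IOPS")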

> 6 nodes,
> 
> each node 10 osd/ ssd drives (dual 10gbit network).  (1journal + datas
> on each osd)
> 
That halves the write speed of each SSD, leaving you with about 2GB/s
max write speed per node.
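
For the bandwidth side, here is where that number comes from, assuming
roughly 450MB/s sequential write per S3500 800GB (spec-sheet ballpark,
not a measurement of mine):

    # The journal and data share the same SSD, so every write hits the
    # drive twice; that is what halves the effective bandwidth.
    ssd_seq_write_mb_s = 450                 # assumed per-drive sequential write
    ssds_per_node = 10

    raw_mb_s = ssd_seq_write_mb_s * ssds_per_node    # 4500 MB/s raw
    effective_mb_s = raw_mb_s / 2                    # halved by the co-located journal
    print(f"~{effective_mb_s / 1000:.1f} GB/s effective write bandwidth per node")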

If you're after good write speeds with a replication factor of 2, I
would split the network into public and cluster ones (see the sketch
below).
If you're instead after top read speeds, bond the 2 links into the
public network; half of the SSDs in a node can saturate that.
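
A minimal sketch of the split-network setup in ceph.conf, with example
subnets you would replace with your own:

    [global]
        # client and monitor traffic on one 10GbE link
        public network  = 10.0.0.0/24
        # OSD replication, heartbeat and recovery traffic on the other link
        cluster network = 10.0.1.0/24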

> ssd drives will be enterprise grade,
> 
> maybe intel sc3500 800GB (well known ssd)
> 
How much write activity do you expect per OSD (remember that in your
case writes are doubled)? Those drives have a total write endurance of
about 450TB (within 5 years).
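
To put the 450TB in perspective, a rough budget (it ignores the SSD's
internal write amplification and assumes the journal doubling mentioned
above):

    # Rough endurance budget for one DC S3500 800GB used as an OSD.
    endurance_tb = 450
    years = 5
    osds = 60
    replication = 2

    per_drive_gb_day = endurance_tb * 1000 / (years * 365)   # ~247 GB/day hits the drive
    per_osd_client_gb_day = per_drive_gb_day / 2              # journal + data on the same SSD
    cluster_client_tb_day = per_osd_client_gb_day * osds / replication / 1000
    print(f"~{per_drive_gb_day:.0f} GB/day per drive, "
          f"~{cluster_client_tb_day:.1f} TB/day of client writes "
          f"cluster-wide at replication {replication}")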

> or new Samsung SSD PM853T 960GB (don't have too much info about it for
> the moment, but price seem a little bit lower than intel)
> 

Looking at the specs it seems to have better endurance (I used
500GB/day, a value that seemed realistic given the 2 numbers they gave),
at least double that of the Intel.
Alas they only give a 3-year warranty, which makes me wonder.
Also the latencies are significantly higher than those of the 3500.
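
For comparison, the 500GB/day figure works out like this over the same
5-year horizon as the Intel rating (the horizon is my assumption to make
the two numbers comparable, not a vendor spec):

    # Endurance comparison, PM853T (assumed 500 GB/day) vs DC S3500 800GB.
    samsung_gb_day = 500
    intel_endurance_tb = 450
    years = 5

    samsung_endurance_tb = samsung_gb_day * 365 * years / 1000
    print(f"PM853T: ~{samsung_endurance_tb:.0f} TB vs DC S3500: "
          f"{intel_endurance_tb} TB ({samsung_endurance_tb / intel_endurance_tb:.1f}x)")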

> 
> I would like to have some advice on the replication level,
> 
> 
> Maybe somebody have experience with intel sc3500 failure rate ?

I doubt many people have managed to wear out SSDs of that vintage in
normal usage yet. And so far none of my dozens of Intel SSDs (including
some ancient X25-M ones) have died.

> What are the chances of having 2 failing disks on 2 different nodes
> at the same time (Murphy's law ;)?
> 
Indeed.
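
You can put a very rough number on it; all inputs below are assumptions
for illustration, not measurements:

    # Chance of replication 2 losing both copies of some data: a drive
    # dies, and one of the OSDs on other nodes that holds replicas of its
    # PGs dies too before re-replication finishes.
    afr = 0.01               # assumed annual failure rate per SSD
    osds = 60
    recovery_hours = 2       # assumed time to re-replicate a failed 800GB OSD
    replica_spread = 10      # assumed number of OSDs sharing PGs with the failed one

    hours_per_year = 24 * 365
    p_overlap = replica_spread * afr * recovery_hours / hours_per_year
    events_per_year = osds * afr * p_overlap
    print(f"~{events_per_year:.1e} expected double-failure events per year "
          f"(about 1 every {1 / events_per_year:,.0f} years)")

Correlated failures (same batch, same wear pattern) are what such an
estimate can't capture, and that is where Murphy usually comes in.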

