Re: Failure probability with largish deployments

Hello,

On Thu, 19 Dec 2013 12:42:15 +0100 Wido den Hollander wrote:

> On 12/19/2013 09:39 AM, Christian Balzer wrote:
[snip]
> >
> 
> I'd suggest using different vendors for the disks, which means you'll 
> probably be mixing Seagate and Western Digital in such a setup.
> 
That's funny, because I wouldn't use either of those vendors these days;
in fact it will likely be years before I consider Seagate again, if ever.
WD is comparatively overpriced and lower-performing (though I do love the
Velociraptors for certain situations).
Since WD also bought Hitachi, I am currently pretty much stuck with
Toshiba drives. ^.^
That all said, I know where you're coming from, and on principle I'd agree.

Also, buying the "same" 3TB disk from different vendors at vastly differing
prices is going to mean a battle with the people paying for the
hardware. ^o^

> In this case you can also rule out batch issues with disks, and the 
> likelihood of the same disks failing becomes smaller as well.
> 
> Also, make sure that you define your crushmap so that replicas never end up 
> on the same physical host and, if possible, not in the same cabinet/rack.
> 
One would hope that to be a given, provided the correct input was made.
People here seem to be obsessed with rack failures; in my case everything
(switches, PDUs, dual PSUs per server) is redundant within each rack, so
there is no SPOF and no particular likelihood of a rack failing in its
entirety.
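For reference, here's a minimal sketch of a CRUSH rule along those lines
(assuming the stock "default" root bucket and that hosts are declared under
rack buckets in the map; adjust names to your own hierarchy), which places
each replica in a different rack rather than just a different host:

    rule replicated_racks {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        # chooseleaf at rack level: no two replicas share a rack
        step chooseleaf firstn 0 type rack
        step emit
    }

For host-level separation only, the chooseleaf step would use "type host"
instead.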

> I would never run with 60 drives in a single machine in a Ceph cluster; 
> I'd suggest you use more machines with fewer disks per machine.
> 
This was given as an example to show how quickly and in how little space
you can reach a number of disks that pushes the failure probability to
near certainty (see the sketch below).
And I would deploy such a cluster if I had the need, simply using n+1 or
n+2 machines. To be on the safe side, instead of 10 servers deploy 12, six
per rack, and set your full ratio to 80%.
And of course you can reduce the overhead and failure likelihood further by
doing local RAIDs, for example 4x 14-disk RAID6 plus 4 global hot spares
per node.
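To put a rough number on "near certainty", here's a back-of-the-envelope
sketch; the 4% annual failure rate is my own assumption, not a figure from
this thread, and failures are treated as independent:

    # Probability of at least one drive failure per year with N drives,
    # given a per-drive annual failure rate (AFR) and independent failures.
    def p_any_failure(n_disks, afr=0.04):
        return 1.0 - (1.0 - afr) ** n_disks

    for n in (60, 120, 600):
        print("%4d disks: %.1f%%" % (n, 100 * p_any_failure(n)))

At that assumed AFR a single 60-drive chassis already sits around 90% per
year, which is exactly why the spare capacity (n+1/n+2 nodes, 80% full
ratio) and the local RAID6 matter.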

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



