[no subject]

And yes, that MON example was exactly what I was aiming for: your cluster
might still have all the data (another potential failure mode, of course),
but it is inaccessible.

DRBD will see it and call it a split brain, Ceph will call it a Paxos voting
failure; it doesn't matter one iota to the poor sod relying on that
particular storage.
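
To illustrate why a lost MON can leave intact data unreachable, here is a
minimal sketch of a strict-majority quorum check. It assumes a plain
"more than half of the monitors must be reachable" rule; the function names
are hypothetical and are not part of any Ceph API.

```python
def has_quorum(total_mons: int, reachable_mons: int) -> bool:
    """Return True if the reachable monitors form a strict majority.

    Assumption: quorum means more than half of all configured monitors.
    """
    if total_mons <= 0:
        raise ValueError("total_mons must be positive")
    if not 0 <= reachable_mons <= total_mons:
        raise ValueError("reachable_mons must be between 0 and total_mons")
    return reachable_mons > total_mons // 2


def tolerated_failures(total_mons: int) -> int:
    """Return how many monitors can fail while a strict majority remains."""
    if total_mons <= 0:
        raise ValueError("total_mons must be positive")
    return (total_mons - 1) // 2
```

Under that assumption, a two-monitor setup tolerates no failure at all, and
a fourth monitor buys nothing over three.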

My point was and is, when you design a cluster of whatever flavor, make
sure you understand how it can (and WILL) fail, how to prevent that from
happening if at all possible and how to recover from it if not.

In the case of Ceph, recovery would potentially (hopefully) be just a matter
of getting a missing MON back up.
But the failed MON might have a corrupted leveldb (it happened to me), which
would put Robert back at square one, as in, a highly qualified engineer has
to deal with the issue.
I.e., somebody who can say "screw this dead MON, let's get a new one in" and
is capable of doing so.

Regards,

Christian

> If you are a creative admin however, you may be able to enforce split 
> brain by modifying monmaps.  In the end you'd obviously end up with two 
> distinct monitor clusters, but if you so happened to not inform the 
> clients about this there's a fair chance that it would cause havoc with 
> unforeseen effects.  Then again, this would be the operator's fault, not 
> Ceph itself -- especially because rewriting monitor maps is not trivial 
> enough for someone to mistakenly do something like this.
> 
>    -Joao
> 
> 

-- 
Christian Balzer        Network/Systems Engineer                
chibi at gol.com   	Global OnLine Japan/Fusion Communications
http://www.gol.com/