Re: How are split-brain situations handled in Ceph?

If you're clustering something important, do it at the application level. Financial transactions, for example, are replicated at the application level for exactly this reason.

As far as Ceph goes, I'm not an expert yet. But even with all the filesystem wizardry in the world, some things need to be handled outside the block or object level.

For geographical failover our company requires a human to make the switch, so we never have the same active system running in two places. That way the second copy is merely out of sync, not a completely different set. Otherwise one fiber cut and you've got two running servers creating two different pools, and reaching convergence between the two has a lot of ifs and buts. Anything is possible, but how far you'll have to travel to get there is the question.

I'm curious whether one of the geniuses here has found a way to unscramble eggs, but I don't count on my filesystem to be in charge of situations like this, only to keep as many copies as I need so that a human can pick what happens next. Sometimes that means picking a winner and losing the data written to an islanded system.

Rob

Sent from my iPhone

> On Oct 26, 2016, at 8:55 AM, Andreas Davour <ante@xxxxxxxxxxxx> wrote:
> 
> 
> Hi
> 
> I was talking about a potential Ceph setup, and a question arose about how Ceph handles a potential split-brain situation. I had my own ideas about how that would be handled, but I want to consult the wider knowledge base here to verify my understanding.
> 
> So, let's say we have two data centres. Replication is configured so there are 3 replicas, with at least one copy in each data centre. There is also an odd number of MONs in the cluster.
> 
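A placement constraint like that is normally expressed as a CRUSH rule. A minimal sketch, assuming a crushmap with "datacenter"-type buckets (the rule name, ruleset id and bucket names here are illustrative, not from Andreas' actual setup):

    rule replicated_two_dc {
        ruleset 1
        type replicated
        min_size 3
        max_size 3
        step take default
        # descend into both datacenters, then pick up to 2 hosts in each;
        # with pool size 3 this places 2 replicas in one DC and 1 in the other
        step choose firstn 2 type datacenter
        step chooseleaf firstn 2 type host
        step emit
    }

Compiled with crushtool and injected with "ceph osd setcrushmap" as usual.
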
> If we now get a net split, we end up with 2 replicas in one data centre (A) and 1 in the other (B). In theory we should be good, as no data is lost, and if there is more than one OSD in B it will re-balance.
> 
> But what happens now when it comes to writes? If we write to both sides of the split, we've lost.
> 
> If there is 1 MON in B, that cluster will have quorum within itself and keep running, and in A the MON cluster will vote and reach quorum again. In that case we have two clusters, both accepting writes to the same objects in those 2 and 1 replicas.
> 
> So, how do Paxos, CRUSH and the other protocols make sure that both sides of the split are not active?
> 
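For what it's worth, my understanding is that the monitor quorum rule is what prevents this: a monitor only participates in quorum together with a strict majority of the monitors listed in the monmap. With an odd total, a lone MON in B is a minority and can never form quorum on its own, so clients and OSDs on the B side stop getting map updates and that side stops serving I/O; only the majority side stays active. You can check which monitors hold quorum with something like (monitor names illustrative, output abbreviated):

    $ ceph quorum_status --format json-pretty
    {
        "quorum_names": [ "mon-a1", "mon-a2" ],
        ...
    }

On the minority side the same command would simply hang, since no monitor there can answer with a quorum behind it.
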
> Pointers in the documentation appreciated, as well as other explanations.
> 
> /andreas
> 
> --
> "economics is a pseudoscience; the astrology of our time"
> Kim Stanley Robinson
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


