Workings of Tiebreaker IP (RHCS)

Rick Rodgers <rodgersr@xxxxxxxxx> · Sat, 23 Sep 2006 17:20:57 -0700 (PDT)

I pulled a message from 2005 about tiebreakers. I have some questions, and it does not seem to agree with what I see clumanager do.

>> Hello,
>> 
>> To completely understand what the role of a tiebreaker IP within a two
>> or four node RHCS cluster is, I've searched redhat and Google. I can't
>> however find anything describing the precise workings of the
>> tiebreaker-IP. I would really like to know exactly what happens when
>> the tiebreaker is used and how (maybe even some kind of flow diagram).
>>
>> Can anyone here maybe explain that to me, or point me in the direction
>> of more specific information regarding the tiebreaker?

>The tiebreaker IP address is used as an additional vote in the event
>that half the nodes become unreachable or dead in a 2 or 4 node cluster
>on RHCS.

>The IP address must reside on the same network as is used for cluster
>communication.  To be a little more specific, if your cluster is using
>eth0 for communication, your IP address used for a tiebreaker must be
>reachable only via eth0 (otherwise, you will end up with a split brain).

>When enabled, the nodes ping the given IP address at regular intervals.
>When the IP address is not reachable, the tiebreaker is considered
>"dead".  When it is reachable, it is considered "alive".
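
As a rough illustration only (not clumanager's actual code), the alive/dead check described above could look like the Python below. The function names, the one-second timeout, and the use of the Linux `ping -c`/`-W` options are assumptions made for the sketch; the real daemon's probe interval and retry policy are not stated in this thread.

```python
import subprocess

def ping_once(ip, timeout_s=1):
    """Return True if a single ICMP echo to `ip` is answered.

    Assumes the Linux iputils `ping` binary (-c count, -W reply timeout).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def tiebreaker_state(ip, probe=ping_once):
    """Map reachability onto the "alive"/"dead" labels used in the post."""
    return "alive" if probe(ip) else "dead"
```

The `probe` argument is there only so the mapping can be exercised without touching the network.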

>It acts as an additional vote (like an extra cluster member), except for
>one key difference: Unless the default configuration is overridden, the

How does this work? Does the node trying to become the active node access the tiebreaker and put a lock on it? How does it reserve it?
Just pinging it would not prevent the other node from doing the same.

>IP tiebreaker may not be used to *form* a quorum where one did not exist
>previously.

>So, if one node of a two node cluster is online, it will never become
>quorate unless the other node comes online (or administrator override,
>see man pages for "cluforce" and "cludb").

>So, in a 2 node cluster, if one node fails and the other node is online
>(and the tiebreaker is still "alive" according to that node), the
>remaining node considers itself quorate and "shoots" (aka STONITHs, aka
>fences) the dead node and takes over services.
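
The quorum rule described so far can be sketched in a few lines of Python. This is a hypothetical model, not clumanager source: the function name, its parameters, and the "strict majority" test for the ordinary case are assumptions. Only the tiebreaker behaviour (an extra vote at exactly half, never used to form a new quorum) comes from the text above.

```python
def is_quorate(total_nodes, nodes_seen, tiebreaker_alive, was_quorate):
    """Decide whether this partition may consider itself quorate.

    nodes_seen counts this node plus every peer it can still reach.
    """
    if 2 * nodes_seen > total_nodes:
        # Assumed ordinary case: a strict majority needs no tiebreaker.
        return True
    if 2 * nodes_seen == total_nodes:
        # Exactly half: the tiebreaker counts as the extra vote, but only
        # to keep a quorum that already existed, never to form a new one.
        return tiebreaker_alive and was_quorate
    return False
```

For example, `is_quorate(2, 1, True, False)` is false (a lone node booting cannot form quorum), while `is_quorate(2, 1, True, True)` is true (the survivor of a previously quorate pair keeps quorum and may fence its peer).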

>If a network partition occurs such that both nodes see the tiebreaker
>but not each other, the first one to fence the other will naturally win.

>Ok, moving on...

>The disk tiebreaker works in a similar way, except that it lets the
>cluster limp along in a safe, semi-split-brain state during a
>network outage.  What I mean is that because there's state information
>written to/read from the shared raw partitions, the nodes can actually
>tell via other means whether or not the other node is "alive", as
>opposed to relying solely on the network traffic.

>Both nodes update state information on the shared partitions.  When one
>node detects that the other node has not updated its information for a
>period of time, that node is "down" according to the disk subsystem.  If
>this coincides with a "down" status from the membership daemon, the node
>is fenced and services are failed over.  If the node never goes down
>(and keeps updating its information on the shared partitions), then the

I do not use an IP tiebreaker. I have a two-node system. When the active node shows as down via membership but up via disk, clumanager determines it is in an "uncertain state" and shoots it.

>node is never fenced and services never fail over.
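
To pin down what the quoted description claims, here is a minimal Python sketch of that decision. It follows only Lon's wording (fence when the disk status and the membership status are both "down"). The names and the timeout parameter are hypothetical, and it deliberately does not model the "uncertain state" behaviour reported in the reply above, which would correspond to a different rule.

```python
def should_fence(membership_up, last_disk_update, now, disk_timeout_s):
    """Return True if the peer should be fenced under the quoted rule.

    last_disk_update and now are timestamps in seconds; disk_timeout_s is
    an assumed name for the allowed gap between shared-partition updates.
    """
    # "Down" per the disk subsystem: no state update within the timeout.
    disk_up = (now - last_disk_update) <= disk_timeout_s
    # Fence only when both views agree that the peer is down.
    return (not membership_up) and (not disk_up)
```

With these inputs, a peer that is down via membership but still writing to the shared partitions (`should_fence(False, 100.0, 105.0, 10.0)`) is not fenced, which is exactly the case where the observed behaviour differs from the description.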

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster