Alain Moulle wrote:
> Alain Moulle wrote:
>
>>>Hi
>>>
>>>I wonder what the strategy is in CS4 when
>>>the heartbeat network is overloaded for
>>>a while, so much that none of the nodes
>>>gets a response to its heartbeat checks.
>>>
>>>Do all nodes in the cluster decide to fence (reboot)
>>>their neighbours, and succeed in doing so once the
>>>load on the network lessens?
>>>Or what?
>>>Is there any safeguard on this point to
>>>avoid CS4 sending fence reboot requests to
>>>all nodes in the cluster, just because the
>>>network is overloaded?
>>>
>
>>CMAN uses quorum to decide whether it can carry on operating after a cluster
>>split. If more than half of the nodes are still talking to each other then
>>they will have quorum and will fence the remaining nodes.
>>
>>If none of the nodes can see any other node (e.g. Ethernet switch failure) then
>>none of the nodes will have quorum on its own so no fencing will be done.
>>
>>If you subsequently reconnect the nodes after that catastrophe they will all
>>drop out of the cluster as no node can be sure of the state of any other node
>>- to do so would endanger data. So you will need to restart cluster services
>>on all nodes.
>>--
>>patrick
>
> Hi
> And thanks Patrick for this detailed answer.
> But just a further question: what about the case of a cluster
> with only two nodes, where the quorum mechanism can't be
> applied: will we be in your second case too (when
> none of the nodes can see any other node)?

cman has a special "two_node" mode which allows the cluster to continue with
only one vote. There is simply a race to see which node gets fenced first!

> Or will both nodes immediately try to fence each other?
> Thanks

yes!

--
patrick

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
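
The quorum rule described in the thread above can be illustrated with a small sketch. It is not CMAN code: it assumes one vote per node and a simple "more than half" majority, and the names `quorum_threshold`, `has_quorum` and `partition_action` are hypothetical, chosen only for this example. The `two_node` flag models the special mode mentioned in the reply, where a two-node cluster may continue with a single vote.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest vote count that is strictly more than half (assumed rule)."""
    return total_votes // 2 + 1


def has_quorum(visible_votes: int, total_votes: int, two_node: bool = False) -> bool:
    """Return True if a partition seeing `visible_votes` may keep operating.

    Assumption: every node carries exactly one vote. With `two_node` set,
    a single vote is enough, as described in the thread.
    """
    if two_node:
        if total_votes != 2:
            raise ValueError("two_node mode only makes sense for two nodes")
        return visible_votes >= 1
    return visible_votes >= quorum_threshold(total_votes)


def partition_action(visible_votes: int, total_votes: int, two_node: bool = False) -> str:
    """Describe what a partition would do after a split (illustrative only)."""
    if has_quorum(visible_votes, total_votes, two_node):
        return "quorate: fence the unreachable nodes"
    return "inquorate: no fencing"


if __name__ == "__main__":
    # Five-node cluster split 3/2: only the larger side fences.
    print(partition_action(3, 5))  # quorate: fence the unreachable nodes
    print(partition_action(2, 5))  # inquorate: no fencing
    # Total isolation (e.g. switch failure): each node sees only itself.
    print(partition_action(1, 5))  # inquorate: no fencing
    # Two-node cluster in two_node mode: both sides are quorate, so both
    # try to fence and whichever succeeds first wins the race.
    print(partition_action(1, 2, two_node=True))  # quorate: fence the unreachable nodes
```

Under these assumptions an overloaded heartbeat network that isolates every node leaves no partition quorate, so nobody fences; only the two-node case lets both sides attempt fencing at once.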