Alain Moulle wrote:
>> Hi
>>
>> I wonder which is the strategy in CS4 when
>> the Heart Beat network is over-loaded for
>> a while, so much that none of the nodes
>> have responses on heart beat check.
>>
>> Do all nodes in cluster decide to fence reboot
>> their neighbours and succeed to do it when the
>> load on network is lessening ?
>> Or what ?
>> Do we have any security on this point to
>> avoid the fence reboot request of CS4 towards
>> all nodes in the cluster, just because the
>> network is over-loaded ?
>>
> CMAN uses quorum to decide whether it can carry on operating after a cluster
> split. If more than half of the nodes are still talking to each other then
> they will have quorum and will fence the remaining nodes.
>
> If none of the nodes can see any other node (e.g. ethernet switch failure) then
> none of the nodes will have quorum on its own so no fencing will be done.
>
> If you subsequently reconnect the nodes after that catastrophe they will all
> drop out of the cluster as no node can be sure of the state of any other node
> - to do so would endanger data. So you will need to restart cluster services
> on all nodes.
> -- patrick

Hi

And thanks Patrick for this detailed answer.
But just a further question: what about the case of a cluster with
only two nodes, where the quorum mechanism can't be applied?
Will we be in your second case too (when none of the nodes can see
any other node)?
Or will both nodes immediately try to fence each other?

Thanks
Alain Moullé

--
mailto:Alain.Moulle@xxxxxxxx
+------------------------------+--------------------------------+
| Alain Moullé                 | from France : 04 76 29 75 99   |
|                              | FAX number : 04 76 29 72 49    |
| Bull SA                      |                                |
| 1, Rue de Provence           | Adr : FREC B1-041              |
| B.P. 208                     |                                |
| 38432 Echirolles - CEDEX     | Email: Alain.Moulle@xxxxxxxx   |
| France                       | BCOM : 229 7599                |
+------------------------------+--------------------------------+

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
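
For readers following the thread, the majority rule described in Patrick's reply can be sketched roughly as below. This is only an illustration under stated assumptions, not CMAN's actual code: it assumes one vote per node, and the function names (quorum_threshold, has_quorum, partitions_with_quorum) are made up for the example. It does not model any special two-node configuration, so it only shows why plain majority voting gives neither side quorum when two nodes lose sight of each other.

```python
# Illustrative sketch only: simple-majority quorum, one vote per node (assumption).
# All names here are hypothetical and are not part of CMAN.

def quorum_threshold(expected_votes: int) -> int:
    """Smallest vote count that is strictly more than half of expected_votes."""
    return expected_votes // 2 + 1


def has_quorum(visible_votes: int, expected_votes: int) -> bool:
    """True if a partition seeing visible_votes holds a strict majority."""
    return visible_votes >= quorum_threshold(expected_votes)


def partitions_with_quorum(partition_sizes: list[int]) -> list[bool]:
    """For a cluster split into partitions, report which ones keep quorum."""
    total = sum(partition_sizes)
    return [has_quorum(size, total) for size in partition_sizes]


if __name__ == "__main__":
    # Five nodes split 3 / 2: only the larger side keeps quorum.
    print(partitions_with_quorum([3, 2]))
    # Five nodes all isolated (e.g. switch failure): nobody has quorum.
    print(partitions_with_quorum([1, 1, 1, 1, 1]))
    # Two nodes that cannot see each other: neither reaches a majority.
    print(partitions_with_quorum([1, 1]))
```

Under these assumptions, at most one partition can ever hold quorum, which is what keeps the "fence everybody" scenario from happening in the general case; the two-node split is the case where the rule on its own leaves no survivor.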