Alain Moulle wrote:
> Hi
>
> In the Config file defaults, we can find :
>
> #define DEFAULT_HELLO_TIMER 5 /* Period between HELLO messages */
> #define DEFAULT_DEADNODE_TIMER 21 /* If we don't get a message from a
> #define DEFAULT_MAX_RETRIES 5 /* Number of times we resend a message */
>
> That seems to mean that the node sends a Hello message
> on heart-beat interface every 5s, waits at max 21s before
> retry and this 5 times, and if at 5th time , it has no
> response in the 21s period , it decides to kill the other node.
> Am I right ?

Not quite. The max retries doesn't apply to heartbeat messages, only to
internal messages (such as those used during transitions or by communicating
applications). So the 21s is the total time a node is allowed to go without
sending a heartbeat (not 5x21 as you implied).

> Besides, could you explain for me the JOINREQ, JOINACK, and
> JOINCONF notions ?
>

They are to do with the joining protocol, obviously. A new node sends a
JOINREQ message to a node, which responds with a JOINACK (which may be a
NAK). When the cluster has completed a transition to admit the node, a
JOINCONF is sent.

--
patrick

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
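To make the timer arithmetic concrete, here is a rough sketch in C of how the two heartbeat values relate. It is only an illustration, not the actual cluster manager code: the helper names node_is_dead() and hello_periods_in_window() are made up for this example, times are assumed to be whole seconds, and the strict "greater than" comparison is an assumption too. The point is that DEFAULT_MAX_RETRIES plays no part in the heartbeat check.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted from the config defaults in the message above. */
#define DEFAULT_HELLO_TIMER     5   /* Period between HELLO messages */
#define DEFAULT_DEADNODE_TIMER  21  /* Max heartbeat silence, per the reply above */
#define DEFAULT_MAX_RETRIES     5   /* Resends of internal messages only */

/*
 * Hypothetical helper (not from the real source): returns true when a peer
 * last heard from at `last_heard` should be considered dead at time `now`.
 * Both times are in seconds. The retry count is deliberately not involved.
 */
static bool node_is_dead(long now, long last_heard, long deadnode_timer)
{
    assert(deadnode_timer > 0);
    return (now - last_heard) > deadnode_timer;
}

/*
 * Hypothetical helper: how many whole HELLO periods fit inside the
 * dead-node window, i.e. how many expected heartbeats can go missing
 * before the window expires.
 */
static int hello_periods_in_window(int hello_timer, int deadnode_timer)
{
    assert(hello_timer > 0);
    return deadnode_timer / hello_timer;
}
```

With the defaults, four HELLO periods (at 5s, 10s, 15s and 20s) fit inside the 21s window, so under these assumptions a peer is declared dead once more than 21s pass with no heartbeat, not after 5 x 21s.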