On 07/30/2013 07:14 PM, Russell Jones wrote:
> Hi all,
> I am trying to understand how the corosync token, token_retransmit, and
> token_retransmits_before_loss_const variables all tie in together.
> I have a standard RHCS v3 cluster set up and running. The token
> timeout is set to 10000. When testing, it seems to detect failed
> members pretty consistently within 10 seconds. What I am not
> understanding is *when* a node is declared dead and a fence call is
> actually made. The man pages show that the cluster is reconfigured
> when the "token" time is reached, and also when
> token_retransmits_before_loss_const is reached. This is confusing :-)
I agree; after reading the man page, the wording is a bit confusing. Better
wording for token_retransmits_before_loss_const would be:
token_retransmits_before_loss_const
    This value identifies how many token retransmits should be attempted.
    If no token is received by the next processor in the ring before token
    expires, a new configuration will be formed. If this value is set,
    retransmit and hold will be automatically calculated from
    retransmits_before_loss and token.
    The default is 4 retransmissions.
I've submitted an upstream manual page change.
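To make the "automatically calculated" part more concrete, here is a rough sketch of how I understand retransmit and hold to be derived from token and token_retransmits_before_loss_const. The formulas are my reading of the corosync totem configuration code, not something the man page states, and the HZ value of 100 is an assumption; derived_timeouts is just a made-up helper name. They do reproduce the documented defaults (token 1000 gives retransmit 238 and hold 180).

```python
# Sketch only: the formulas and the HZ value are assumptions based on my
# reading of corosync's totem config code, not on the man page text.
HZ = 100  # assumed timer frequency used in the hold calculation


def derived_timeouts(token_ms, retransmits_before_loss_const=4):
    """Return (retransmit_ms, hold_ms) derived from the token timeout."""
    # Interval between token retransmits within one token timeout window.
    retransmit_ms = int(token_ms / (retransmits_before_loss_const + 0.2))
    # Hold timeout is kept a little below the retransmit interval.
    hold_ms = int(retransmit_ms * 0.8 - (1000 / HZ))
    return retransmit_ms, hold_ms


if __name__ == "__main__":
    for token in (1000, 10000):
        retransmit, hold = derived_timeouts(token)
        print(f"token={token} retransmit={retransmit} hold={hold}")
```

If this reading is right, a token value of 10000 gives a retransmit interval of roughly 2380 ms. The retransmit count only spaces out the retransmits inside the token window; a new configuration is still formed only when the token timeout itself expires without a token being received.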
> Which one is it that will reform the cluster? Both? When does one
> take precedence over the other?
> Thanks!