Re: [PATCH v2] Fixed bug: Corosync may trigger token lost timeout in RECOVERY state

John Thompson <thompa26@xxxxxxxxx> · Thu, 1 Dec 2011 10:35:55 +1300

On Thu, Dec 1, 2011 at 7:19 AM, Steven Dake <sdake@xxxxxxxxxx> wrote:

The analysis looks good but solution looks wrong.

If token is lost (the previous ring id fails for some reason) on the

node that cancels the token timeout, the protocol will freeze.

this part from your logs doesn't look correct:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring

messages from 42726-46695.

> Nov 30 00:52:06 corosync [TOTEM ] totemsrp.c:2145 Originated 16240

messages in RECOVERY.

> ==>Costed 13 seconds to copy all old ring messages

At most 3969 messages would be originated, not 16240 messages.  Perhaps

the loop is operating incorrectly?  Can you print out range and

messages_originated at line 2111 with your test case?  One of these

values may be getting corrupted by an errant memcpy.

The message numbers printed here are in hex and so the difference is 16240 between them

Regards,
John

Insepcting the code the memset and memcpy operations look correct in

size in the range loop (although the previous could could have an errant

memcpy).

Regards

-steve

On 11/30/2011 08:20 AM, Yunkai Zhang wrote:

> I found that corosync may trigger token lost timeout in RECOVERY state.

> The reason is that different nodes may spend different time to copy old

> ring messages from its regular_sort_queue to retrans_message_queue in

> *memb_state_recovery_enter* function.

>

> Let's have a look at the following logs(don't care about the line numbers,

> I have modify the source code for debugging):

>

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2034 my_memb:5, new_memb:5, trans_memb:5

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [0] member 192.168.0.92:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [1] member 192.168.0.124:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [2] member 192.168.0.142:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [3] member 192.168.0.160:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [4] member 192.168.0.164:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [0] member 192.168.0.92:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [1] member 192.168.0.124:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [2] member 192.168.0.142:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 42725 high delivered 42725 received flag 0

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [3] member 192.168.0.160:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [4] member 192.168.0.164:

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1

>

> We can see that there was five nodes forming a new ring at 00:51:53, they had

> different aru value and node(192.168.0.142) had the smallest aru value.

>

> Here is some logs comes from these five nodes, we can see how much time they

> spend for copying old ring messages:

>

> [logs from 192.168.0.92]

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.

> Nov 30 00:52:06 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.

> ==>Costed 13 seconds to copy all old ring messages

>

> [logs from 192.168.0.124]

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.

> Nov 30 00:52:09 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.

> ==>Costed 16 seconds to copy all old ring messages

>

> [logs from 192.168.0.142]

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46693.

> Nov 30 00:51:56 corosync [TOTEM ] totemsrp.c:2145 Originated 2321 messages in RECOVERY.

> ==>Costed 3 seconds to copy all old ring messages

>

> [logs from 192.168.0.160]

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.

> Nov 30 00:51:58 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.

> ==>Costed 5 seconds to copy all old ring messages

>

> [192.168.0.164]

> Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.

> Nov 30 00:52:05 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.

> ==>Costed 12 seconds to copy all old ring messages

>

> After the node(192.168.0.142) costed 3 seconds to copy all messages, it start

> waiting for the commit_token from it's upstream(192.168.0.124), but it could

> not received as it's upstream(192.168.0.124) was busy copying messages at that

> time between 00:51:56 to 00:52:09. So this node(192.168.0.142) would finally

> trigger token lost timeout in RECOVERY state.

>

> More seriously, after it triggered token lost timeout, the node(192.168.0.142)

> enter GATHER state, but it could not receive JoinMSG from other nodes because

> others was also busy at that time, and then this node ran into consensus

> expired timeout state finally.

>

> To solve this issue, we can cancel token_timeout

> (but keeping token_retransmit_timeout) in RECOVERY state.

>

> Signed-off-by: Yunkai Zhang <qiushu.zyk@xxxxxxxxxx>

> ---

>  exec/totemsrp.c |    2 +-

>  1 files changed, 1 insertions(+), 1 deletions(-)

>

> diff --git a/exec/totemsrp.c b/exec/totemsrp.c

> index 5a78962..e0299e7 100644

> --- a/exec/totemsrp.c

> +++ b/exec/totemsrp.c

> @@ -2206,7 +2206,7 @@ originated:

>       instance->my_install_seq = SEQNO_START_MSG;

>       instance->last_released = SEQNO_START_MSG;

>

> -     reset_token_timeout (instance); // REVIEWED

> +     cancel_token_timeout (instance);

>       reset_token_retransmit_timeout (instance); // REVIEWED

>

>       instance->memb_state = MEMB_STATE_RECOVERY;

_______________________________________________

discuss mailing list

discuss@xxxxxxxxxxxx

http://lists.corosync.org/mailman/listinfo/discuss

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss