On Thu, Dec 1, 2011 at 5:39 AM, Steven Dake <sdake@xxxxxxxxxx> wrote:
> On 11/30/2011 02:35 PM, John Thompson wrote:
>> On Thu, Dec 1, 2011 at 7:19 AM, Steven Dake <sdake@xxxxxxxxxx> wrote:
>>
>> The analysis looks good, but the solution looks wrong.
>>
>> If the token is lost (the previous ring id fails for some reason) on the
>> node that cancels the token timeout, the protocol will freeze.
>>
I know that this solution is not perfect. I'll consider more.

>> This part of your logs doesn't look correct:
>>
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.
>> > Nov 30 00:52:06 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.
>> > ==> It took 13 seconds to copy all old ring messages
>>
>> At most 3969 messages would be originated, not 16240 messages. Perhaps
>> the loop is operating incorrectly? Can you print out range and
>> messages_originated at line 2111 with your test case? One of these
>> values may be getting corrupted by an errant memcpy.
>>
>> The message numbers printed here are in hex, and so the difference is
>> 16240 between them.
>>
> Thanks for pointing that out. :)
> Yes, the message numbers are in hex :)
>
> Regards
> -steve
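
To spell the arithmetic out (just a stand-alone check, not corosync code): the sequence numbers in those log lines are printed in hexadecimal, so the copied range 42726-46695 really means 0x42726-0x46695, which is 16240 messages inclusive, while reading the same digits as decimal gives only 46695 - 42726 = 3969, which is where the smaller estimate came from.

#include <stdio.h>

int main(void)
{
	/* Sequence numbers as they appear in the totemsrp logs, read as hex. */
	unsigned int low  = 0x42726;	/* first old ring message copied */
	unsigned int high = 0x46695;	/* last old ring message copied  */

	/* Prints 16240, matching "Originated 16240 messages in RECOVERY". */
	printf("messages in range (hex):    %u\n", high - low + 1);

	/* Reading the same digits as decimal gives only 3969. */
	printf("difference read as decimal: %u\n", 46695u - 42726u);

	return 0;
}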

>> Regards,
>> John
>>
>> Inspecting the code, the memset and memcpy operations look correct in
>> size in the range loop (although the previous code could have an errant
>> memcpy).
>>
>> Regards
>> -steve
>>
>> On 11/30/2011 08:20 AM, Yunkai Zhang wrote:
>> > I found that corosync may trigger the token loss timeout in the RECOVERY state.
>> > The reason is that different nodes may spend different amounts of time copying
>> > old ring messages from regular_sort_queue to retrans_message_queue in the
>> > *memb_state_recovery_enter* function.
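
To give a rough idea of what that copying involves, here is a small stand-alone sketch, not the actual totemsrp.c code (the message size and counts are only illustrative), of the per-message copy work that runs between the "copying all old ring messages" and "Originated N messages in RECOVERY" log lines shown below:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_SIZE 1400	/* assumed message size, for illustration only */

int main(void)
{
	/* Number of old ring messages this node still holds in the range the
	 * new ring agreed on.  Most nodes below held 16240 of them;
	 * 192.168.0.142, with the smallest aru, held only 2321. */
	unsigned int held = 16240;
	unsigned int i, originated = 0;
	char src[MSG_SIZE];

	memset(src, 0xab, sizeof (src));

	/* One allocation + memcpy per held message; the real code does more
	 * per-message bookkeeping than shown here, which is where the seconds
	 * between the two log lines go. */
	for (i = 0; i < held; i++) {
		char *copy = malloc(MSG_SIZE);
		if (copy == NULL)
			break;
		memcpy(copy, src, MSG_SIZE);
		free(copy);	/* the real code keeps the copy queued for the new ring */
		originated++;
	}

	printf("Originated %u messages in RECOVERY (sketch)\n", originated);
	return 0;
}

The point is only that the cost grows with the number of old ring messages the node still holds, which is why 192.168.0.142, with the smallest aru, finished so much earlier than its upstream node.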

>> > Let's have a look at the following logs (don't worry about the line numbers;
>> > I have modified the source code for debugging):
>> >
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2034 my_memb:5, new_memb:5, trans_memb:5
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [0] member 192.168.0.92:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [1] member 192.168.0.124:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [2] member 192.168.0.142:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [3] member 192.168.0.160:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [4] member 192.168.0.164:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [0] member 192.168.0.92:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [1] member 192.168.0.124:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [2] member 192.168.0.142:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 42725 high delivered 42725 received flag 0
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [3] member 192.168.0.160:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [4] member 192.168.0.164:
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq 1161924 rep 192.168.0.92
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high delivered 46695 received flag 1
>> >
>> > We can see that five nodes were forming a new ring at 00:51:53; they had
>> > different aru values, and node 192.168.0.142 had the smallest one.
>> >
>> > Here are some logs from these five nodes, showing how much time each of
>> > them spent copying old ring messages:
>> >
>> > [logs from 192.168.0.92]
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.
>> > Nov 30 00:52:06 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.
>> > ==> It took 13 seconds to copy all old ring messages
>> >
>> > [logs from 192.168.0.124]
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.
>> > Nov 30 00:52:09 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.
>> > ==> It took 16 seconds to copy all old ring messages
>> >
>> > [logs from 192.168.0.142]
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46693.
>> > Nov 30 00:51:56 corosync [TOTEM ] totemsrp.c:2145 Originated 2321 messages in RECOVERY.
>> > ==> It took 3 seconds to copy all old ring messages
>> >
>> > [logs from 192.168.0.160]
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.
>> > Nov 30 00:51:58 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.
>> > ==> It took 5 seconds to copy all old ring messages
>> >
>> > [logs from 192.168.0.164]
>> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring messages from 42726-46695.
>> > Nov 30 00:52:05 corosync [TOTEM ] totemsrp.c:2145 Originated 16240 messages in RECOVERY.
>> > ==> It took 12 seconds to copy all old ring messages
>> >
>> > After node 192.168.0.142 had spent 3 seconds copying all of its messages, it
>> > started waiting for the commit_token from its upstream node (192.168.0.124),
>> > but it could not receive it because that upstream node was still busy copying
>> > messages between 00:51:56 and 00:52:09. So node 192.168.0.142 eventually
>> > triggered the token loss timeout in the RECOVERY state.
>> >
>> > Worse, after it triggered the token loss timeout, node 192.168.0.142 entered
>> > the GATHER state, but it could not receive JoinMSGs from the other nodes
>> > because they were also busy at that time, and so this node finally ran into
>> > the consensus expired timeout as well.
>> >
>> > To solve this issue, we can cancel token_timeout (but keep
>> > token_retransmit_timeout) in the RECOVERY state.
>> >
>> > Signed-off-by: Yunkai Zhang <qiushu.zyk@xxxxxxxxxx>
>> > ---
>> >  exec/totemsrp.c |    2 +-
>> >  1 files changed, 1 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/exec/totemsrp.c b/exec/totemsrp.c
>> > index 5a78962..e0299e7 100644
>> > --- a/exec/totemsrp.c
>> > +++ b/exec/totemsrp.c
>> > @@ -2206,7 +2206,7 @@ originated:
>> >  	instance->my_install_seq = SEQNO_START_MSG;
>> >  	instance->last_released = SEQNO_START_MSG;
>> >
>> > -	reset_token_timeout (instance); // REVIEWED
>> > +	cancel_token_timeout (instance);
>> >  	reset_token_retransmit_timeout (instance); // REVIEWED
>> >
>> >  	instance->memb_state = MEMB_STATE_RECOVERY;

--
Yunkai Zhang
Work at Taobao

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss