On 12/10/2011 02:27 AM, Yunkai Zhang wrote:
> After the representative of the new ring entered the RECOVERY state,
> it waited for the commit_token on the second rotation. But when it
> got a commit_token, it didn't check the status of this token, which
> could have been retransmitted by an upstream node on the first
> rotation, and then the representative node would send the initial ORF
> token prematurely.
>
> If other nodes which have not entered the RECOVERY state receive this
> incorrect initial orf_token (whose seq is 0), these nodes may crash
> on the assert instruction in the *orf_token_rtr* function:
> ...

Yunkai,

First off, thanks for digging into Totem. I know the protocol is hard
to understand, and it's nice to have more people to bounce ideas off of.

Do you have a test case for this? The patch is a clever solution, but
the root cause seems incorrect. I'd like to get to the RCA of your bug.

Just thinking about some of the problems you have reported in the past
(13 seconds to execute a recovery copy per node), it could be possible
for a retransmission of the commit token from 5 to 1 (kicking off a new
orf token) in the example below to occur while in recovery. That should
be ok, though, since retransmitted orf tokens are discarded
(totemsrp.c:3564). I don't see any immediate way to have a premature
token transmission.

Let's examine 5 processors:

When in gather, a membership is built. Once this membership is agreed
upon, a commit token is originated by the ring rep, creating the first
transmission around the ring (totemsrp.c:4044).

Example with 5 processors:

1,2,3,4,5 in gather
1 originates commit token

Once a commit token is received in the gather state, the processor
enters commit (first token rotation) (totemsrp.c:4348).

Example:

1 goes to commit
2 goes to commit
3 goes to commit
4 goes to commit
5 goes to commit

Once a commit token is received in the commit state, the processor
enters the recovery state, but only if this is a new commit token
(totemsrp.c:4360).

1 goes to recovery
2 goes to recovery
3 goes to recovery
4 goes to recovery
5 goes to recovery
1 originates orf token (this is the start of the third rotation)
(totemsrp.c:4374)

In the above failure-free scenario, we see that we will never get a
premature orf token.

Perhaps the problem you're running into is that you're mixing
partitioning/merging while forming a new ring. This could result in
my_id.addr[0] and the ring rep matching, but *old* commit tokens not
being discarded by the ring rep in the recovery state. I believe the
code would be better served by something like:

if (totemip_equal (&instance->my_id.addr[0], &instance->my_ring_id.rep) &&
	memb_commit_token->ring_id.seq == instance->my_ring_id_seq) {
...
}

Note it may be safer to compare the entire ring_id rather than only the
seq. This will cause any commit token not from the currently agreed
processors to be rejected. The ring seq increases by 4 for each new
ring, and it is not possible to have a matching seq + rep in two
separate rings (this is why we store the ring seq to stable storage).

Could you test that, or tell me your test case, or both?
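To make the "reject stale commit tokens" idea concrete, here is a
rough, untested sketch of the intent. Everything below (toy_ring_id,
toy_instance, should_send_initial_orf, the addresses and seq values)
is a stand-in I made up for illustration, not the real totemsrp
structures or code path:

/*
 * Toy model of the suggested guard: the ring rep sitting in
 * MEMB_STATE_RECOVERY should only treat a commit token as the trigger
 * for the initial ORF token if the token belongs to the agreed ring.
 * None of these types are the real totemsrp structures.
 */
#include <stdio.h>
#include <string.h>

struct toy_ring_id {
	char rep[16];			/* ring representative address */
	unsigned long long seq;		/* ring sequence, +4 per new ring */
};

struct toy_commit_token {
	struct toy_ring_id ring_id;
};

struct toy_instance {
	char my_addr[16];
	struct toy_ring_id my_ring_id;
	unsigned long long my_ring_id_seq;
};

/* Returns 1 when this processor is the rep AND the commit token
 * carries the seq of the ring this membership actually agreed on. */
static int should_send_initial_orf (
	const struct toy_instance *instance,
	const struct toy_commit_token *token)
{
	return strcmp (instance->my_addr, instance->my_ring_id.rep) == 0 &&
		token->ring_id.seq == instance->my_ring_id_seq;
}

int main (void)
{
	struct toy_instance inst = {
		.my_addr = "node1",
		.my_ring_id = { .rep = "node1", .seq = 108 },
		.my_ring_id_seq = 108
	};
	/* leftover token from the previous ring: rep matches, seq does not */
	struct toy_commit_token stale = { .ring_id = { "node1", 104 } };
	/* token from the agreed ring: both checks pass */
	struct toy_commit_token fresh = { .ring_id = { "node1", 108 } };

	printf ("stale token triggers ORF: %d\n",
		should_send_initial_orf (&inst, &stale));	/* prints 0 */
	printf ("fresh token triggers ORF: %d\n",
		should_send_initial_orf (&inst, &fresh));	/* prints 1 */
	return 0;
}

The point is just that the rep/address check alone still passes for a
commit token left over from a previous ring, while adding the seq
comparison rejects it.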
Thanks
-steve

> range = orf_token->seq - instance->my_aru;
> assert (range < QUEUE_RTR_ITEMS_SIZE_MAX);
> ...
>
> Signed-off-by: Yunkai Zhang <qiushu.zyk@xxxxxxxxxx>
> ---
>  exec/totemsrp.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/exec/totemsrp.c b/exec/totemsrp.c
> index 5a78962..22717ac 100644
> --- a/exec/totemsrp.c
> +++ b/exec/totemsrp.c
> @@ -4372,7 +4372,8 @@ static int message_handler_memb_commit_token (
>  		break;
>
>  	case MEMB_STATE_RECOVERY:
> -		if (totemip_equal (&instance->my_id.addr[0], &instance->my_ring_id.rep)) {
> +		if (totemip_equal (&instance->my_id.addr[0], &instance->my_ring_id.rep) &&
> +			memb_commit_token->token_seq == 2*memb_commit_token->addr_entries) {
>  			log_printf (instance->totemsrp_log_level_debug,
>  				"Sending initial ORF token\n");
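On the hunk quoted above: if I read the check correctly, it encodes
"the commit token has completed exactly two full rotations", assuming
each of the addr_entries processors bumps token_seq once per pass. A
throwaway counter (plain C written for this mail, not corosync code)
of that arithmetic for the 5-processor example:

#include <stdio.h>

int main (void)
{
	const int addr_entries = 5;	/* processors in the new membership */
	int token_seq = 0;
	int rotation;
	int node;

	for (rotation = 1; rotation <= 2; rotation++) {
		/* rotation 1: gather -> commit; rotation 2: commit -> recovery */
		for (node = 1; node <= addr_entries; node++) {
			token_seq++;	/* one bump per forwarding processor */
		}
		printf ("after rotation %d: token_seq = %d\n",
			rotation, token_seq);
	}

	/* the patch lets the rep originate the ORF token only here */
	printf ("token_seq == 2*addr_entries: %s\n",
		token_seq == 2 * addr_entries ? "yes" : "no");
	return 0;
}

That matches the two rotations (gather -> commit, commit -> recovery)
walked through above, but it ties acceptance to an exact hop count
rather than to the agreed ring, which is why comparing against the
ring_id still looks like the more direct fix to me.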