On 09/03/15 01:09, renayama19661014@xxxxxxxxx wrote:
> Hi All,
>
> We constitute a cluster in corosync.
> We shutdown one node afterwards.
>
> Then the node that we shutdown is sometimes judged with fail by a cluster.
>
> ---------------------------------------
> Oct 21 11:03:30 XXX corosync[21677]: [TOTEM ] A processor failed, forming new configuration.
> ---------------------------------------
>
> This phenomenon seems to occur with very low probability.
>
> We think that it is a problem that there is the log that the node that we shutdown is taken as trouble.
>
> The problem is because the leave message(memb_leave_message_send) which is sent when a user stops corosync may be thrown away.
>
> static int net_deliver_fn (
>         int fd,
>         int revents,
>         void *data)
> {
>         struct totemudp_instance *instance = (struct totemudp_instance *)data;
>         struct msghdr msg_recv;
>         struct iovec *iovec;
> (snip)
>         /*
>          * Drop all non-mcast messages (more specifically join
>          * messages should be dropped)
>          */
>         message_type = (char *)iovec->iov_base;
>         if (instance->flushing == 1 && *message_type == MESSAGE_TYPE_MEMB_JOIN) {
>                 iovec->iov_len = FRAME_SIZE_MAX;
>                 return (0);
>         }
> (snip)
>
> A secession leave is handled definitely and wishes a node stops.
> Is the correction of the handling of problem of corosync possible?
> * We think that it is a problem that there is the log that the node that we shutdown is taken as trouble.

Yes, I think I can see what's happening here. JOIN messages get discarded
during flushing because they can cause entry into GATHER state at an
inappropriate time. For a normal JOIN message that's fine, because the
joining node will re-send the message. But it also causes LEAVE messages
to be discarded (as they are a special case of JOIN). This causes the
error you are seeing.

The fix is non-trivial, sadly, but I'm looking into it.

Thanks for the report!

Chrissie

> We hope that this problem is revised in the next version if possible.
>
> Best Regards,
> Hideo Yamauchi.
>
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss