[TOTEM] The token was lost in the OPERATIONAL state: explanation?

Jos Vos <jos@xxxxxx> · Sat, 10 Nov 2007 23:05:40 +0100

Hi,

In a two-node cluster, a few times per day one of the nodes (not always
the same) reboots because it is fenced by the other node.  The logging
on the fencing node starts with:

Nov 10 22:30:14 node2 openais[3275]: [TOTEM] The token was lost in the OPERATIONAL state.
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering GATHER state from 2.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering GATHER state from 0.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Creating commit token because I am the rep.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Saving state aru 32fc3 high seq received 32fc3
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering COMMIT state.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering RECOVERY state.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] position [0] member <ip-addr-of-node-2>:
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] previous ring seq 56 rep <ip-addr-of-node-1>
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] aru 32fc3 high delivered 32fc3 received flag 0
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Did not need to originate any messages in recovery.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Storing new sequence id for ring 3c
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Sending initial ORF token

On the fenced node, in most cases nothing is logged before the reboot.
A few times, a "fatal: filesystem consistency error" was reported on
the fenced node just before the reboot.

When nothing is logged, should I assume the cause is also a filesystem
error, and that the message simply was not written to disk before the
node was fenced?
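
To see how often and at what times the token is lost on each node, the
[TOTEM] lines can be pulled out of the syslog with their timestamps.
Below is a minimal sketch, assuming the syslog line format shown above.
The function names and the regular expression are illustrative only and
are not part of openais. The year must be supplied because syslog
timestamps omit it, and the month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Matches syslog lines in the format shown in the log excerpt, e.g.
# "Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering COMMIT state."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"openais\[(?P<pid>\d+)\]: \[TOTEM\] (?P<msg>.*)$"
)


def totem_events(lines, year=2007):
    """Yield (timestamp, host, message) for every [TOTEM] line.

    Lines that do not match the assumed format are skipped.
    """
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        # Syslog omits the year, so it is prepended before parsing.
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        yield ts, m.group("host"), m.group("msg")


def token_losses(lines, year=2007):
    """Return (timestamp, host) for each 'token was lost' event."""
    return [
        (ts, host)
        for ts, host, msg in totem_events(lines, year)
        if msg.startswith("The token was lost")
    ]
```

Running token_losses() over the message logs of both nodes and comparing
the timestamps with the reboot times should show whether every fencing
event is preceded by a token loss on the surviving node.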

Thanks,

--
--    Jos Vos <jos@xxxxxx>
--    X/OS Experts in Open Systems BV   |   Phone: +31 20 6938364
--    Amsterdam, The Netherlands        |     Fax: +31 20 6948204
