Good Morning Cluster Experts,

I have a 3-node cluster with Virtual Machine services. During the full-OS backup timeframe (heavy I/O activity), one of the VMs is receiving a shutdown request. It has happened 3 times in 8 weeks, to 3 different VMs. I assume the cluster is sending this shutdown message. The VM restarts immediately afterwards, likely as a result of cluster monitoring.

I checked the messages log. It appears that we are not using a heartbeat, since I did not add any <totem/> to cluster.conf. This version of the cluster does not use the openais.conf file, but rather cman is started as a service of aisexec (cman 2.0).

Does anyone have suggestions about what to do? Who is sending the shutdown request; is it groupd?

I have two NICs configured on the nodes. Is one or both IP subnets used in the multicast? Which one?

Thanks,
Paul Dyer

P.S. Here is the messages log from a node startup showing the openais/totem portion:

Mar 15 16:25:16 lxprodas1xen openais[6250]: [MAIN ] AIS Executive Service RELEASE 'subrev 1887 version 0.80.6'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [MAIN ] AIS Executive Service: started and ready to provide service.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [MAIN ] Using default multicast address of 239.192.48.228
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] Token Timeout (10000 ms) retransmit timeout (495 ms)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] token hold (386 ms) retransmits before loss (20 retrans)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] send threads (0 threads)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] RRP token expired timeout (495 ms)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] RRP token problem counter (2000 ms)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] RRP threshold (10 problem count)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] RRP mode set to none.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] heartbeat_failures_allowed (0)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] max_network_delay (50 ms)
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] The network interface [198.62.216.73] is now up.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] Created or loaded sequence id 660.198.62.216.73 for this ring.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] entering GATHER state from 15.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [CMAN ] CMAN 2.0.115 (built Nov 19 2009 10:37:31) started
Mar 15 16:25:16 lxprodas1xen openais[6250]: [MAIN ] Service initialized 'openais CMAN membership service 2.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais extended virtual synchrony service'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais cluster membership service B.01.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais availability management framework B.01.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais checkpoint service B.01.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais event service B.01.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais distributed locking service B.01.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais message service B.01.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais configuration service'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais cluster closed process group service v1.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SERV ] Service initialized 'openais cluster config database access v1.01'
Mar 15 16:25:16 lxprodas1xen openais[6250]: [SYNC ] Not using a virtual synchrony filter.
Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] Creating commit token because I am the rep.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
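
In case it helps anyone compare the totem startup values across the three nodes, here is a minimal sketch that pulls the "name (value unit)" settings, the bound interface address, and the multicast address out of log lines like the ones above. The function name summarize_totem and the regular expressions are my own illustration, not part of cman or openais, and they assume the log format shown in this post.

```python
import re

# "name (value unit)" pairs as they appear in the [TOTEM] startup lines,
# e.g. "Token Timeout (10000 ms)" or "heartbeat_failures_allowed (0)".
PARAM_RE = re.compile(r"([A-Za-z_][A-Za-z_ ]*?) \((\d+)(?: ([a-z ]+))?\)")
IFACE_RE = re.compile(r"The network interface \[([0-9.]+)\] is now up")
MCAST_RE = re.compile(r"multicast address of ([0-9.]+)")


def summarize_totem(lines):
    """Collect totem settings, bound interfaces, and the multicast address.

    Hypothetical helper: it only understands the syslog format pasted above.
    """
    summary = {"params": {}, "interfaces": [], "multicast": None}
    for line in lines:
        mcast = MCAST_RE.search(line)
        if mcast:
            summary["multicast"] = mcast.group(1)
        if "[TOTEM] " not in line:
            continue
        message = line.split("[TOTEM] ", 1)[1]
        iface = IFACE_RE.search(message)
        if iface:
            summary["interfaces"].append(iface.group(1))
        for name, value, unit in PARAM_RE.findall(message):
            # Unit is None for bare counts such as "(0)".
            summary["params"][name.lower()] = (int(value), unit or None)
    return summary


if __name__ == "__main__":
    sample = [
        "Mar 15 16:25:16 lxprodas1xen openais[6250]: [MAIN ] Using default multicast address of 239.192.48.228",
        "Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] Token Timeout (10000 ms) retransmit timeout (495 ms)",
        "Mar 15 16:25:16 lxprodas1xen openais[6250]: [TOTEM] The network interface [198.62.216.73] is now up.",
    ]
    print(summarize_totem(sample))
```

Run against the excerpt above, the only interface it reports is 198.62.216.73, which suggests totem bound to a single NIC on this node; whether that holds on the other two nodes would need checking against their own logs.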