Here is my cluster.conf:

#########################################
<?xml version="1.0"?>
<cluster alias="myiacon" config_version="16" name="myiacon">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="60"/>
  <clusternodes>
    <clusternode name="ratchet.local" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ratchet_ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="skydive.local" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="skydive_ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="wheeljack.local" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="wheeljack_ipmi"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.1.100" login="root" name="ratchet_ipmi" passwd="xxxxx"/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.1.102" login="root" name="skydive_ipmi" passwd="xxxxx"/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.1.101" login="root" name="wheeljack_ipmi" passwd="xxxxxx"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
#############################################

And here is one of the errors I just started getting:

Sep 29 08:10:06 wheeljack openais[5453]: [MAIN ] Killing node ratchet.local because it has rejoined the cluster with existing state

But half the time, the servers just complain that they can't reconnect to the cluster.

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Mark Chaney
Sent: Monday, September 29, 2008 3:07 AM
To: linux-cluster@xxxxxxxxxx
Subject: proper cluster crash procedures?

I have a 3 node cluster with shared storage on an iSCSI SAN, hence I am using GFS. Anyway, it crashed for whatever reason (I'm not sure if something was rebooted incorrectly or what), and I have now spent the past 2 hours trying to get the cluster back up.
I would think that simply rebooting all the nodes would work, but heck, that hasn't. What should I be doing? Should I just start them up one at a time? BTW, I am using IPMI for fencing, if that makes a difference. I can post my cluster.conf if that's helpful, but I would think there would be general techniques available.

Thanks,
Mark

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster