After an upgrade from 5.2 to 5.3, the cluster, named GFSCluster, seems to stop being a cluster. GFSCluster is a 2-node cluster using iSCSI, cman, clvm, and gfs, and it was working fine when it was on 5.2.

The configuration on both of the nodes (passwords removed):

    <?xml version="1.0"?>
    <cluster name="GFSCluster" config_version="5">
      <cman expected_votes="1" two_node="1"/>
      <clusternodes>
        <clusternode name="node01.company.com" votes="1" nodeid="1">
          <fence>
            <method name="single">
              <device name="node01_ipmi"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="node02.company.com" votes="1" nodeid="2">
          <fence>
            <method name="single">
              <device name="node02_ipmi"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <fencedevices>
        <fencedevice name="node01_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.5" login="root" passwd="********"/>
        <fencedevice name="node02_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.7" login="root" passwd="********"/>
      </fencedevices>
      <rm>
        <failoverdomains/>
        <resources/>
      </rm>
    </cluster>

When starting the cman service, both nodes hang at the "Starting fencing" step:

    Starting cluster:
       Loading modules... done
       Mounting configfs... done
       Starting ccsd... done
       Starting cman... done
       Starting daemons... done
       Starting fencing...
After 5 minutes the task finishes with "done", but clustat says:

==== As root on web01.company.com ====

    Cluster Status for GFSCluster @ Wed Jul 8 01:00:24 2009
    Member Status: Quorate

     Member Name                 ID   Status
     ------ ----                 ---- ------
     node01.company.com             1 Online, Local
     node02.company.com             2 Offline

==== As root on web02.company.com ====

    Cluster Status for GFSCluster @ Wed Jul 8 01:00:26 2009
    Member Status: Quorate

     Member Name                 ID   Status
     ------ ----                 ---- ------
     node01.company.com             1 Offline
     node02.company.com             2 Online, Local

Each node is quorate in its own one-node cluster.

In the logs of web01 I found these messages repeating:

    Jul 8 00:55:27 web01 fenced[21872]: node02.company.com not a cluster member after 6 sec post_join_delay
    Jul 8 00:55:27 web01 fenced[21872]: fencing node "node02.company.com"
    Jul 8 00:55:52 web01 fenced[21872]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:10.1.0.7...ipmilan: Failed to connect after 30 seconds Failed

In the logs of web02 I found the same messages repeating, with the nodes swapped:

    Jul 8 00:55:27 web02 fenced[6363]: node01.company.com not a cluster member after 6 sec post_join_delay
    Jul 8 00:55:27 web02 fenced[6363]: fencing node "node01.company.com"
    Jul 8 00:55:53 web02 fenced[6363]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:10.1.0.5...ipmilan: Failed to connect after 30 seconds Failed

Is there a bug in 5.3 with regard to clustering? Are there any workarounds?
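
Since the fenced logs on both nodes show fence_ipmilan failing to connect, it may help to see at a glance which IPMI address each node is fenced through. The sketch below is not part of the original post. It is a minimal illustration, assuming Python's standard xml.etree.ElementTree module and the cluster.conf quoted above pasted inline; the function name fence_targets is made up for this example. It only reads the configuration and does not contact any fence device.

```python
import xml.etree.ElementTree as ET

# cluster.conf as quoted in the post (passwords already masked there).
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster name="GFSCluster" config_version="5">
  <cman expected_votes="1" two_node="1"/>
  <clusternodes>
    <clusternode name="node01.company.com" votes="1" nodeid="1">
      <fence><method name="single"><device name="node01_ipmi"/></method></fence>
    </clusternode>
    <clusternode name="node02.company.com" votes="1" nodeid="2">
      <fence><method name="single"><device name="node02_ipmi"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="node01_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.5" login="root" passwd="********"/>
    <fencedevice name="node02_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.7" login="root" passwd="********"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
"""


def fence_targets(conf_xml):
    """Hypothetical helper: map each node name to (agent, ipaddr) of its fence device.

    A node whose <device> refers to an undefined <fencedevice> maps to None.
    If a node lists several devices, the last one wins (enough for this config).
    """
    root = ET.fromstring(conf_xml)
    # Index the <fencedevice> definitions by their name attribute.
    devices = {d.get("name"): d for d in root.iter("fencedevice")}
    targets = {}
    for node in root.iter("clusternode"):
        for dev in node.iter("device"):
            fd = devices.get(dev.get("name"))
            # Compare against None explicitly: an Element without children is falsy.
            if fd is not None:
                targets[node.get("name")] = (fd.get("agent"), fd.get("ipaddr"))
            else:
                targets[node.get("name")] = None
    return targets


for name, target in sorted(fence_targets(CLUSTER_CONF).items()):
    print(name, "->", target)
```

With the configuration above, this pairs node01 with 10.1.0.5 and node02 with 10.1.0.7, which are the same addresses that appear in the "Failed to connect after 30 seconds" log lines, so those are the addresses whose reachability would be worth checking from each node.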