Re: Cannot make cluster after upgrade

This sounds a little like split-brain. Have you tried the clean_start=1 option?
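
In case it helps, clean_start is an attribute of the fence_daemon element in cluster.conf. Here is a minimal sketch of where it would go, assuming the configuration quoted below. The config_version of 6 is my assumption (one higher than the quoted 5, since the version has to be bumped on every change), and the elided sections stand for your existing entries. With clean_start=1, fenced skips fencing at startup, so I would treat this as a diagnostic step rather than a permanent setting, especially with shared GFS storage.

```xml
<?xml version="1.0"?>
<!-- config_version="6" is an assumption: bump it from whatever is currently deployed -->
<cluster name="GFSCluster" config_version="6">
 <cman expected_votes="1" two_node="1"/>
 <!-- clean_start="1" tells fenced to assume all nodes are in a clean state at startup -->
 <fence_daemon clean_start="1"/>
 <!-- existing clusternodes, fencedevices, and rm sections stay unchanged -->
</cluster>
```

The same file needs to be on both nodes before cman is restarted.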

On Jul 7, 2009 11:54 PM, "Abed-nego G. Escobal, Jr." <abednegoyulo@xxxxxxxxx> wrote:


After an upgrade from 5.2 to 5.3, the cluster, named GFSCluster, seems to have stopped being a cluster. GFSCluster is a two-node cluster using iSCSI, cman, clvm, and gfs, and it was working fine on 5.2. The configuration on both nodes (passwords removed) is:

<?xml version="1.0"?>
<cluster name="GFSCluster" config_version="5">
<cman expected_votes="1" two_node="1"/>
 <clusternodes>
   <clusternode name="node01.company.com" votes="1" nodeid="1">
     <fence><method name="single"><device name="node01_ipmi"/></method></fence>
   </clusternode>
   <clusternode name="node02.company.com" votes="1" nodeid="2">
     <fence><method name="single"><device name="node02_ipmi"/></method></fence>
   </clusternode>
 </clusternodes>
 <fencedevices>
   <fencedevice name="node01_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.5" login="root" passwd="********"/>
   <fencedevice name="node02_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.7" login="root" passwd="********"/>
 </fencedevices>
 <rm>
   <failoverdomains/>
   <resources/>
 </rm>
</cluster>

When starting the cman service, both nodes hang at the "Starting fencing" step:

Starting cluster:
  Loading modules... done
  Mounting configfs... done
  Starting ccsd... done
  Starting cman... done
  Starting daemons... done
  Starting fencing...

After 5 minutes the task finishes with "done", but clustat says:

==== As root on web01.company.com ====
 Cluster Status for GFSCluster @ Wed Jul  8 01:00:24 2009
 Member Status: Quorate

  Member Name                             ID   Status
  ------ ----                             ---- ------
  node01.company.com                         1 Online, Local
  node02.company.com                         2 Offline


==== As root on web02.company.com ====
 Cluster Status for GFSCluster @ Wed Jul  8 01:00:26 2009
 Member Status: Quorate

  Member Name                             ID   Status
  ------ ----                             ---- ------
  node01.company.com                         1 Offline
  node02.company.com                         2 Online, Local

Each node is quorate in its own one-node cluster and sees the other as offline.

In the logs of web01 I found these messages repeating:

Jul  8 00:55:27 web01 fenced[21872]: node02.company.com not a cluster member after 6 sec post_join_delay
Jul  8 00:55:27 web01 fenced[21872]: fencing node "node02.company.com"
Jul  8 00:55:52 web01 fenced[21872]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:10.1.0.7...ipmilan: Failed to connect after 30 seconds Failed


In the logs of web02 I found the same messages repeating:

Jul  8 00:55:27 web02 fenced[6363]: node01.company.com not a cluster member after 6 sec post_join_delay
Jul  8 00:55:27 web02 fenced[6363]: fencing node "node01.company.com"
Jul  8 00:55:53 web02 fenced[6363]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:10.1.0.5...ipmilan: Failed to connect after 30 seconds Failed


Is there a bug in 5.3 with regard to clustering?
Are there any workarounds?




--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
