Hi All,
We are running a two-node cluster on CentOS 5.5 on HP ProLiant servers. One of the servers in the cluster failed and would not boot because of a hardware issue, so we removed the node from the cluster and also from the fencing domain.
Our cluster service is now running on the single remaining node. After resolving the hardware issue, we are trying to add the node back to the cluster using the luci interface, but it fails with the following error:
"Host already member of cluster testcluster"
However, the host does not appear in the cluster config file; the following grep returns nothing:
# grep host2 /etc/cluster/cluster.conf
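A slightly more specific check than the grep above is to print the config version and the clusternode names from the file, so the output can be compared between the two nodes. This is only a sketch: conf_version and conf_nodes are hypothetical helper names, and the text matching assumes each <cluster ...> and <clusternode ...> tag keeps its attributes on a single line.

```shell
#!/bin/sh
# Sketch: text matching with sed, not a real XML parser.
# Assumes the attributes of each <cluster ...> and <clusternode ...> tag
# are on one line, and at most one <clusternode> per line.

conf_version() {
    # $1: path to the config file; prints the config_version value
    sed -n 's/.*<cluster [^>]*config_version="\([^"]*\)".*/\1/p' "$1"
}

conf_nodes() {
    # $1: path to the config file; prints one clusternode name per line
    sed -n 's/.*<clusternode [^>]*name="\([^"]*\)".*/\1/p' "$1"
}

# Path taken from the grep command above; skipped if the file is not readable.
CONF=/etc/cluster/cluster.conf
if [ -r "$CONF" ]; then
    echo "config_version: $(conf_version "$CONF")"
    echo "clusternode names:"
    conf_nodes "$CONF"
fi
```

Running this on both nodes and comparing the output shows whether each node has the same version of the file and which node names it actually contains.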
Restarting cman on the node shows the following error:
====
# service cman restart
Stopping cluster:
Stopping fencing... done
Stopping cman... done
Stopping ccsd... done
Unmounting configfs... done
[ OK ]
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... failed
cman not started: Can't find local node name in cluster.conf /usr/sbin/cman_tool: aisexec daemon didn't start
[FAILED]
--------------
# tail -f /var/log/messages
Jul 29 00:04:16 host2 ccsd[2793]: Local version # : 31
Jul 29 00:04:16 host2 ccsd[2793]: Remote version #: 31
Jul 29 00:04:16 host2 openais[2799]: [MAIN ] AIS Executive Service RELEASE 'subrev 1887 version 0.80.6'
Jul 29 00:04:16 host2 openais[2799]: [MAIN ] AIS Executive Service: started and ready to provide service.
Jul 29 00:04:16 host2 openais[2799]: [MAIN ] local node name "host2.example.com" not found in cluster.conf
Jul 29 00:04:16 host2 openais[2799]: [MAIN ] Error reading CCS info, cannot start
Jul 29 00:04:16 host2 openais[2799]: [MAIN ] Error reading config from CCS
Jul 29 00:04:16 host2 openais[2799]: [MAIN ] AIS Executive exiting (reason: could not read the main configuration file).
=======
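The openais message above complains about the local node name, so one thing worth checking is whether the name this host reports matches a clusternode entry exactly. The sketch below assumes the local node name is the one printed by uname -n (which matches the "host2.example.com" in the log, but is an assumption about how cman picks it); node_in_conf is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: succeed if the given name appears as name="..." on a
# <clusternode ...> line of the given file. Plain text matching,
# assuming the name attribute is on the same line as the tag.
node_in_conf() {
    # $1: node name, $2: path to the config file
    grep -F "<clusternode " "$2" | grep -qF " name=\"$1\""
}

# Assumption: the local node name is what uname -n prints.
NODE=$(uname -n)
CONF=/etc/cluster/cluster.conf
if [ -r "$CONF" ]; then
    if node_in_conf "$NODE" "$CONF"; then
        echo "$NODE is listed in $CONF"
    else
        echo "$NODE is NOT listed in $CONF"
    fi
fi
```

The match includes the closing quote, so a short name such as host2 does not match an entry for host2.example.com, and the reverse also holds.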
Please suggest how we can add the failed node back to the cluster.
Thanks in advance,
Zaman
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster