On 09/17/2012 06:07 PM, Ben .T.George wrote:
Hi,
My cluster is failing to start.
If I check clustat on node1, it shows node1 online and node2
offline. If I check clustat on node2, it shows node2 online and
node1 offline.
I checked the logs; fenced is throwing errors. How can I rectify this?
Sep 17 23:24:54 fenced fencing node cgceccprd1.combinedgroup.net still retrying
Sep 17 23:55:06 fenced fencing node cgceccprd1.combinedgroup.net still retrying
Sep 18 00:25:19 fenced fencing node cgceccprd1.combinedgroup.net still retrying
Sep 18 00:55:03 fenced fenced 3.0.12.1 started
Sep 18 00:55:03 fenced failed to get dbus connection
Sep 18 00:55:55 fenced fencing node cgceccprd1.combinedgroup.net
Sep 18 00:55:55 fenced fence cgceccprd1.combinedgroup.net dev 0.0 agent none result: error no method
Sep 18 00:55:55 fenced fence cgceccprd1.combinedgroup.net failed
Sep 18 00:55:58 fenced fencing node cgceccprd1.combinedgroup.net
Sep 18 00:55:58 fenced fence cgceccprd1.combinedgroup.net dev 0.0 agent none result: error no method
Sep 18 00:55:58 fenced fence cgceccprd1.combinedgroup.net failed
Sep 18 00:56:01 fenced fencing node cgceccprd1.combinedgroup.net
Sep 18 00:56:01 fenced fence cgceccprd1.combinedgroup.net dev 0.0 agent none result: error no method
Sep 18 00:56:01 fenced fence cgceccprd1.combinedgroup.net failed
Please help me solve this issue.
Regards,
Ben
What does your cluster.conf look like?
Most likely you either have no fencing configured, or your fencing is not
working. Either way, failing to fence is a critical problem and the
cluster will hang, just as you're seeing here. This is by design: it is
better to hang a cluster than to corrupt it.
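
As a rough illustration of what "no fencing configured" can look like, here
is a minimal Python sketch that scans a cluster.conf for nodes with no usable
fence device. The "agent none result: error no method" lines in the log would
be consistent with that situation, though that is a guess until the real
config is posted. Everything in SAMPLE_CONF (node names, the ipmi_n1 device,
the address and credentials) is hypothetical and not taken from this cluster,
and nodes_without_fencing is an illustrative helper, not part of any cluster
tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-node cluster.conf, for illustration only.
# node1 has a fence method; node2 has an empty <fence/> block.
SAMPLE_CONF = """\
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1">
      <fence>
        <method name="ipmi">
          <device name="ipmi_n1" action="reboot"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2">
      <fence/>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="ipmi_n1" agent="fence_ipmilan"
                 ipaddr="192.0.2.10" login="admin" passwd="secret"/>
  </fencedevices>
</cluster>
"""


def nodes_without_fencing(conf_xml):
    """Return names of cluster nodes that reference no declared fence device."""
    root = ET.fromstring(conf_xml)
    # Names of all devices declared under <fencedevices>.
    declared = {d.get("name") for d in root.findall("./fencedevices/fencedevice")}
    missing = []
    for node in root.findall("./clusternodes/clusternode"):
        # Device references inside any <method> of this node's <fence> block.
        used = {d.get("name") for d in node.findall("./fence/method/device")}
        # Treat a node as fenceable only if it references a declared device.
        if not (used & declared):
            missing.append(node.get("name"))
    return missing


if __name__ == "__main__":
    print(nodes_without_fencing(SAMPLE_CONF))
```

To try it against a real node, read the config file (usually
/etc/cluster/cluster.conf) and pass its contents to nodes_without_fencing().
This only checks that the config references a fence device; it cannot tell
whether the fence agent actually works against the hardware.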
digimer
--
Digimer
Papers and Projects: https://alteeve.ca
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster