See below;
Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo@xxxxxxxxxx
Pager:1.877.552.9264
Text message: gromo@xxxxxxxxxx
jim parsons <jparsons@xxxxxxxxxx>
Sent by: linux-cluster-bounces@xxxxxxxxxx 01/17/2008 03:40 PM
On Thu, 2008-01-17 at 14:06 -0700, Gary Romo wrote:
>
> I enabled telnet on the MM, now I am getting these messages:
>
> Jan 17 14:00:24 node1 fenced[3229]: fence "node2" failed
> Jan 17 14:00:29 node1 fenced[3229]: fencing node "node2"
> Jan 17 14:00:40 node1 fenced[3229]: agent "fence_bladecenter" reports:
> pattern match timed-out at /sbin/fence_bladecenter line 189
>
> Jan 17 14:00:40 node1 fenced[3229]: fence "node2" failed
> Jan 17 14:00:45 node1 fenced[3229]: fencing node "node2"
> Jan 17 14:00:56 node1 fenced[3229]: agent "fence_bladecenter" reports:
> pattern match timed-out at /sbin/fence_bladecenter line 189
>
> Jan 17 14:00:56 node1 fenced[3229]: fence "node2" failed
> Jan 17 14:01:01 node1 fenced[3229]: fencing node "node2"
> Jan 17 14:01:12 node1 fenced[3229]: agent "fence_bladecenter" reports:
> pattern match timed-out at /sbin/fence_bladecenter line 189
>
> Line 189 looks like this;
>
> ($text, $match) = $t->waitfor("/system:blade\\[$bladenum\\]>/");
>
>
> I am getting these on the second node:
>
> Jan 17 14:03:24 mode2 fenced[3340]: fence "node1" failed
> Jan 17 14:03:29 node2 fenced[3340]: fencing node "node1"
> Jan 17 14:03:29 node2 fenced[3340]: fence "node1" failed
> Jan 17 14:03:34 node2 fenced[3340]: fencing node "node1"
> Jan 17 14:03:34 node2 fenced[3340]: fence "node1" failed
>
Ah, yuck. Well, let's figure out what is going on here.
Can you post the clusternodes and fencedevices sections of your
cluster.conf here? Just XXXX out any passwords.
<?xml version="1.0"?>
<cluster alias="rhcs-1-clus" config_version="4" name="rhcs-1-clus">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="node1" votes="1">
      <multicast addr="XXX.XXX.127.204" interface="eth0"/>
      <fence>
        <method name="1">
          <device blade="2" name="chassis_fence"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" votes="1">
      <multicast addr="XXX.XXX.127.204" interface="eth0"/>
      <fence>
        <method name="1">
          <device blade="3" name="chassis_fence"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1">
    <multicast addr="XXX.XXX.127.204"/>
  </cman>
  <fencedevices>
    <fencedevice agent="fence_bladecenter" ipaddr="XXX.XXX.1.143" login="rchs_fence" name="chassis_fence" passwd="XXXXXXX"/>
  </fencedevices>
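
For anyone cross-checking a config like the one above, here is a rough Python sketch that lists which blade and fence device each node is mapped to. It is only an illustration: SAMPLE_CONF is a trimmed, hypothetical copy of the posted fragment (with a closing </cluster> tag added so it parses), and summarize_fencing is a made-up helper name, not part of any cluster tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed copy of the cluster.conf fragment posted in this thread.
# The closing </cluster> tag is added here only so the sample parses.
SAMPLE_CONF = """<?xml version="1.0"?>
<cluster alias="rhcs-1-clus" config_version="4" name="rhcs-1-clus">
  <clusternodes>
    <clusternode name="node1" votes="1">
      <fence><method name="1"><device blade="2" name="chassis_fence"/></method></fence>
    </clusternode>
    <clusternode name="node2" votes="1">
      <fence><method name="1"><device blade="3" name="chassis_fence"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_bladecenter" ipaddr="XXX.XXX.1.143" login="rchs_fence" name="chassis_fence" passwd="XXXXXXX"/>
  </fencedevices>
</cluster>
"""


def summarize_fencing(conf_text):
    """Return one dict per (node, fence device) pair found in the config."""
    root = ET.fromstring(conf_text)
    # Index the <fencedevice> definitions by name so node entries can be resolved.
    devices = {d.get("name"): d for d in root.iter("fencedevice")}
    rows = []
    for node in root.iter("clusternode"):
        for dev in node.iter("device"):
            definition = devices.get(dev.get("name"))
            rows.append({
                "node": node.get("name"),
                "device": dev.get("name"),
                "blade": dev.get("blade"),
                # None means the node references a device that is not defined.
                "agent": definition.get("agent") if definition is not None else None,
                "ipaddr": definition.get("ipaddr") if definition is not None else None,
            })
    return rows


if __name__ == "__main__":
    for row in summarize_fencing(SAMPLE_CONF):
        print(row)
```

A row with agent set to None would point at a device name that a node uses but that has no matching fencedevice entry. The password attribute is deliberately left out of the summary.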
On one of the cluster nodes, can you run
'/sbin/fence_bladecenter -a <ip or hostname of bladecenter> -l <login>
-p <passwd> -n <blade number of another running node> -o status -v'
[root@lxdnt648 ~]# /sbin/fence_bladecenter -a chassis -l rchs_fence -p XXXXXXX -n 2 -o status -v
Please use '-h' for usage.
Do you know the firmware details for your bladecenter? The
fence_bladecenter script hasn't changed in years. The tested firmware
versions are listed at the top of the file. Maybe the interface has changed. If
so, the debuglog should give us information.
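
As background on the "pattern match timed-out" message: line 189 of the agent, quoted earlier in this thread, waits for a prompt matching system:blade\[N\]> and gives up if it never arrives. The Python sketch below only mirrors that regular expression so a captured session transcript can be checked by hand. The function names and the sample strings in the usage note are hypothetical, and this is not the agent's own code.

```python
import re


def blade_prompt_pattern(bladenum):
    """Build the same pattern the agent waits for: system:blade[N]> with literal brackets."""
    return re.compile(r"system:blade\[%d\]>" % int(bladenum))


def saw_blade_prompt(transcript, bladenum):
    """Return True if the expected per-blade prompt appears anywhere in the transcript."""
    return blade_prompt_pattern(bladenum).search(transcript) is not None


if __name__ == "__main__":
    # Made-up transcript snippets, for illustration only.
    print(saw_blade_prompt("system:blade[3]> ", 3))
    print(saw_blade_prompt("system> ", 3))
```

If a real transcript from the management module never contains that prompt for the requested blade number, the wait would be expected to time out, which would fit with the interface having changed.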
1 | chassis | Main application | BRET85M | CNETMNUS.PKT | 01-10-07 | 16
  |         | Boot ROM*        | BRBR82A | CNETBRUS.PKT | 06-01-05 | 16
  |         | Remote control   | BRRG85M | CNETRGUS.PKT | 01-10-07 | 16
This will get us started.
-Jim
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster