Manual fencing simply waits until either
1) the user reboots the failed node _and_ runs fence_ack_manual to
notify the node that requested the fence that this has been done,
or
2) the node that "failed" comes back up.
In the steps you described, you never acknowledged the request for
fencing - hence, you have to wait for the machine to come back up.
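As a rough sketch of the acknowledgement in case 1: on the surviving node (fcc1 in your logs), and only after you have confirmed that fcc4 really has been powered off or reset, you would run something like the following. The -n option is my assumption for your release, so check the fence_ack_manual man page on your system first. The FAILED_NODE variable and the "command -v" guard are only there for illustration.

```shell
# Run on the surviving node (fcc1), never on the node that failed.
# Assumption: this release takes the failed node's name via -n.
FAILED_NODE="fcc4"

if command -v fence_ack_manual >/dev/null 2>&1; then
    # Tells fenced that the node has been reset by hand, so recovery can proceed.
    fence_ack_manual -n "$FAILED_NODE"
else
    echo "fence_ack_manual not found; run this on a cluster node with the fence package installed"
fi
```

Acknowledging a fence while the other node is still running and writing to the shared storage can corrupt the file system, so verify the node's state before you run it.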
brassow
BTW, I'd never use manual fencing in production.
On Apr 3, 2006, at 5:30 AM, Thai Duong wrote:
Hi all,
I have a 2 node GFS 6.1 cluster with the following configuration:
<?xml version="1.0"?>
<cluster name="fccrac" config_version="5">
  <cman two_node="1" expected_votes="1">
  </cman>
  <clusternodes>
    <clusternode name="fcc1" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="fcc1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="fcc4" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="fcc4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_devices>
    <fence_device name="human" agent="fence_manual"/>
  </fence_devices>
</cluster>
It turns out that manual fencing doesn't work as expected. When I force
a node to power down, the other node cannot fence it and, worse, the whole
GFS file system freezes, waiting for the downed node to come back up.
I get something like the following in the kernel log:
Apr 2 16:46:28 fcc1 fenced[3444]: fencing node "fcc4"
Apr 2 16:46:28 fcc1 fenced[3444]: fence "fcc4" failed
Some information about GFS and kernel:
[root@fcc1 ~]# rpm -qa | grep GFS
GFS-6.1.3-0
GFS-kernel-2.6.9-45.0.2
[root@fcc1 ~]# uname -a
Linux fcc1 2.6.9-22.0.2.EL #1 SMP Thu Jan 5 17:04:58 EST 2006 ia64 ia64 ia64 GNU/Linux
Please help.
TIA,
Thai Duong.
--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster