Matt
mbrookov@xxxxxxxxx
On Tue, 2004-12-14 at 12:18, Daniel McNeil wrote:
I was running my test last night and I got an I/O error from the disk subsystem that caused one of the nodes to panic. The other 2 nodes removed the dead node from membership, but the fencing did not work.

cl030 /var/log/messages:

Dec 13 21:54:26 cl030 kernel: CMAN: no HELLO from cl032a, removing from the cluster
Dec 13 21:54:27 cl030 fenced[12121]: fencing node "cl032a"
Dec 13 21:54:27 cl030 fenced[12121]: fence "cl032a" failed
Dec 13 21:54:28 cl030 fenced[12121]: fencing node "cl032a"
Dec 13 21:54:28 cl030 fenced[12121]: fence "cl032a" failed
Dec 13 21:54:29 cl030 fenced[12121]: fencing node "cl032a"

This goes on all night.

cl031 /var/log/messages:

Dec 13 21:54:27 cl031 fenced[11850]: fencing deferred to 1

[root@cl030 root]# fence_ack_manual -s cl032a

Warning: If the node "cl032a" has not been manually fenced (i.e. power cycled or disconnected from shared storage devices) the GFS file system may become corrupted and all its data unrecoverable! Please verify that the node shown above has been reset or disconnected from storage.

Are you certain you want to continue? [yN] y
can't open /tmp/fence_manual.fifo: No such file or directory

I've attached my cluster.conf file. Do I have fencing set up correctly? Any ideas on why fenced is failing to fence?

Thanks,
Daniel
--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster