I've been trying to configure my GFS cluster to use the fence_sanbox2 agent, and I don't quite have it right. My configuration is pasted below.

To test it, I reboot one of the machines in the cluster. Instead of that node being fenced and its journal replayed, the rest of the cluster just hangs (errors below). In addition, when I telnet to the fibre switch, the port has not been disabled as I expected. I am able to run fence_sanbox2 directly to disable and enable the port.

So, does anybody see what I'm doing wrong? How can I debug this further? Does anything special need to be done in the init scripts to disable/re-enable the port, or does fenced take care of that?

I'm using the RHEL4 branch, which I cvs updated a couple of weeks ago.

<CONFIG FILE>
<?xml version="1.0"?>
<cluster name="blade_cluster" config_version="1">
  <fencedevices>
    <fencedevice name="human" agent="fence_manual"/>
    <fencedevice name="san" agent="fence_sanbox2" ipaddr="128.50.18.66" login="gfs" passwd="foobar"/>
  </fencedevices>
  <fence_daemon clean_start="0">
  </fence_daemon>
  <cman>
    <multicast addr="224.0.0.18"/>
  </cman>
  <clusternodes>
    <clusternode name="blade01" nodeid="1" votes="1">
      <multicast addr="224.0.0.18" interface="eth0"/>
      <fence>
        <method name="fibre">
          <device name="san" port="1"/>
        </method>
        <method name="single">
          <device name="human" ipaddr="128.50.18.1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="blade02" nodeid="2" votes="0">
      <multicast addr="224.0.0.18" interface="eth0"/>
      <fence>
        <method name="fibre">
          <device name="san" port="2"/>
        </method>
        <method name="single">
          <device name="human" ipaddr="128.50.18.2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="blade03" nodeid="3" votes="0">
      <multicast addr="224.0.0.18" interface="eth0"/>
      <fence>
        <method name="fibre">
          <device name="san" port="3"/>
        </method>
        <method name="single">
          <device name="human" ipaddr="128.50.18.3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>

<ERRORS>
CMAN: node blade04 has been removed from the cluster : Shutdown
CMAN: node blade04 rejoining
CMAN: removing node blade04 from the cluster : Missed too many heartbeats
CMAN: node blade04 rejoining
CMAN: node blade01 has been removed from the cluster : No response to messages
SM: 00000001 process_recovery_barrier status=-104

-dan

--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster
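
One way to sanity-check a configuration like the one above is to list, for each node, which fence methods exist and which key/value arguments each agent would receive. The following is only a minimal sketch under stated assumptions: the function name `fence_args` and the trimmed `SAMPLE` string are hypothetical, and the merge rule (attributes of the matching `<fencedevice>` combined with those of the per-node `<device>`, the per-node value winning on a clash) is my assumption about how the arguments are assembled, not something confirmed from the fenced source.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted configuration (one node only), for illustration.
SAMPLE = """
<cluster name="blade_cluster" config_version="1">
  <fencedevices>
    <fencedevice name="human" agent="fence_manual"/>
    <fencedevice name="san" agent="fence_sanbox2" ipaddr="128.50.18.66"
                 login="gfs" passwd="foobar"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="blade01" nodeid="1" votes="1">
      <fence>
        <method name="fibre"><device name="san" port="1"/></method>
        <method name="single"><device name="human" ipaddr="128.50.18.1"/></method>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>
"""


def fence_args(conf_xml, node_name):
    """Return [(method_name, args_dict), ...] for one node, in file order.

    Hypothetical helper: merges <fencedevice> attributes with the node's
    <device> attributes (assumed precedence: the per-node value wins).
    """
    root = ET.fromstring(conf_xml.strip())
    # Index the shared <fencedevice> definitions by their name attribute.
    devices = {d.get("name"): dict(d.attrib) for d in root.iter("fencedevice")}
    plans = []
    for node in root.iter("clusternode"):
        if node.get("name") != node_name:
            continue
        for method in node.findall("./fence/method"):
            for dev in method.findall("device"):
                shared = devices.get(dev.get("name"))
                if shared is None:
                    raise KeyError("no <fencedevice> named %r" % dev.get("name"))
                args = dict(shared)
                args.update(dev.attrib)   # per-node attributes such as port
                args.pop("name", None)    # the name is only a lookup key
                plans.append((method.get("name"), args))
    return plans


if __name__ == "__main__":
    for method_name, args in fence_args(SAMPLE, "blade01"):
        print(method_name, sorted(args.items()))
```

A node name that the file does not define returns an empty list. That is worth checking here, because the log above mentions blade04 while the pasted configuration only defines blade01 through blade03.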