Hello list,

I'm setting up a Linux cluster with GFS on shared SAN disks. All cluster nodes have the same LUN presented, and I did all the necessary steps to create a GFS filesystem on that disk. Everything works just fine, but when I power one of the nodes off, the whole GFS hangs: no reads or writes to the filesystem are possible, and every process accessing the device blocks waiting for its I/O to complete.

The cluster nodes are: adnux1, adnux2, adnux3, adnux4, adlade1
During the tests I'm only running the hosts: adnux2, adnux3, adnux4

While GFS is running fine, I can check the cluster with cman_tool:

adnux2 / # cman_tool status
Protocol version: 5.0.1
Config version: 1
Cluster name: adnuxCluster1
Cluster ID: 41625
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 3
Expected_votes: 1
Total_votes: 3
Quorum: 2
Active subsystems: 6
Node name: adnux2
Node addresses: 192.168.1.152

adnux2 / # cat /proc/cluster/services
Service          Name                             GID LID State     Code
Fence Domain:    "default"                          7   5 run       -
[1 2 3]

DLM Lock Space:  "clvmd"                            2   3 run       -
[3 1 2]

DLM Lock Space:  "adnux"                            8   6 run       -
[1 3 2]

GFS Mount Group: "adnux"                            9   7 run       -
[1 3 2]

When I power adnux4 off, the other two hosts cannot access the GFS in any way. /var/log/messages on one of the remaining nodes says:

...
Mar 21 18:35:37 adnux2 kernel: CMAN: node adnux4 has been removed from the cluster : Missed too many heartbeats
Mar 21 18:35:38 adnux2 fenced[5627]: adnux4 not a cluster member after 0 sec post_fail_delay
Mar 21 18:35:38 adnux2 fenced[5627]: fencing node "adnux4"
Mar 21 18:35:38 adnux2 fenced[5627]: fence "adnux4" failed

cman_tool reports that the cluster is still up:

adnux2 ~ # cman_tool status
Protocol version: 5.0.1
Config version: 1
Cluster name: adnuxCluster1
Cluster ID: 41625
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 2
Active subsystems: 6
Node name: adnux2
Node addresses: 192.168.1.152

adnux2 ~ # cat /proc/cluster/services
Service          Name                             GID LID State     Code
Fence Domain:    "default"                          7   5 recover 2 -
[1 2]

DLM Lock Space:  "clvmd"                            2   3 recover 0 -
[1 2]

DLM Lock Space:  "adnux"                            8   6 recover 0 -
[1 2]

GFS Mount Group: "adnux"                            9   7 recover 0 -
[1 2]

Even running "fence_manual -n adnux4" and "fence_ack_manual -n adnux4" successfully has no effect. I'm wondering why GFS is blocking. Surely it can't be that the failed node must be fenced before GFS is able to function again?! I searched for ideas, but only found some people pointing to possibly misconfigured fencing. I'm hoping someone here can tell me where my mistake is. Thank you in advance.
-- configuration file /etc/cluster/cluster.conf --

<?xml version="1.0"?>
<cluster name="adnuxCluster1" config_version="1">
  <cman expected_votes="1" quorum="1">
  </cman>
  <clusternodes>
    <clusternode name="adnux1" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="adnux1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="adnux2" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="adnux2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="adnux3" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="adnux3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="adnux4" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="adnux4"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="adlade1" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="adlade1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_devices>
    <device name="human" agent="fence_manual"/>
  </fence_devices>
</cluster>

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
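[Editor's note, for comparison only: in the Red Hat Cluster Suite cluster.conf schema that this cman/fenced version reads, the fence-device list is normally spelled with <fencedevices> and <fencedevice> elements, not <fence_devices> and <device>. If fenced cannot resolve the device named "human" to an agent, that would match the 'fence "adnux4" failed' message above, and GFS/DLM recovery is designed to block until fencing succeeds. A minimal sketch of that section, assuming the RHEL4-era schema:]

```xml
<fencedevices>
  <!-- maps the device name "human" (referenced in each <method>) to the fence_manual agent -->
  <fencedevice name="human" agent="fence_manual"/>
</fencedevices>
```

[The per-node <device name="human" nodename="..."/> references inside each <method> block would stay as they are; only the device-definition section changes spelling. This is a sketch against an assumed schema version, not a verified fix.]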