Another thing I am a bit confused by.

After hitting the rm hang described before, I expected that resetting one of the nodes in the cluster would clear up the problem, since recovery should clean up the DLM lock state. So I reset cl031. cl030 still had the gfs file system mounted, and cl032 was a member of the cluster but did not have a gfs file system mounted.

When I reset cl031, both other nodes printed:

    CMAN: no HELLO from cl031a, removing from the cluster

Since I had configured manual fencing, I expected to see a message on one of the nodes saying I needed to ack the fencing, but I never saw any message.

After that, running

    cat /proc/cluster/services

hung. I then reset cl030, and cl032 got:

    CMAN: no HELLO from cl030a, removing from the cluster
    CMAN: quorum lost, blocking activity
    SM: 00000001 process_recovery_barrier status=-104

Does the SM: message mean anything?

After rebooting the other 2 nodes, they rejoined the cluster ok, but there were messages in /var/log/messages:

    Dec 3 16:53:44 cl032 fenced[17168]: fencing node "cl030a"
    Dec 3 16:53:44 cl032 fenced[17168]: fence "cl030a" failed

So I'm not sure my manual fencing is working correctly. Any suggestions?

Thanks,
Daniel