(resending since I forgot to include linux-cluster@xxxxxxxxxx)

I am using manual fencing together with gnbd fencing. Here is the tail of /var/log/messages:

Aug 28 14:17:06 bof227 fenced[2497]: bof226 not a cluster member after 0 sec post_fail_delay
Aug 28 14:17:06 bof227 kernel: CMAN: removing node bof226 from the cluster : Missed too many heartbeats
Aug 28 14:17:06 bof227 fenced[2497]: fencing node "bof226"
Aug 28 14:17:06 bof227 fence_manual: Node bof226 needs to be reset before recovery can procede. Waiting for bof226 to rejoin the cluster or for manual acknowledgement that it has been reset (i.e. fence_ack_manual -n bof226)

************************ cluster.conf

<?xml version="1.0"?>
<cluster config_version="84" name="MZ_CLUSTER">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="bof227" votes="1">
      <fence>
        <method name="1">
          <device name="device_MF_227" nodename="bof227"/>
          <device name="gnbd_server_bof226" nodename="bof227"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="bof226" votes="1">
      <fence>
        <method name="1">
          <device name="device_MF_226" nodename="bof226"/>
          <device name="gnbd_server_bof227" nodename="bof226"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="device_MF_226"/>
    <fencedevice agent="fence_manual" name="device_MF_227"/>
    <fencedevice agent="fence_gnbd" name="gnbd_server_bof226" servers="bof226"/>
    <fencedevice agent="fence_gnbd" name="gnbd_server_bof227" servers="bof227"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="FD_PREF_BOF226" ordered="1" restricted="1">
        <failoverdomainnode name="bof226" priority="1"/>
        <failoverdomainnode name="bof227" priority="2"/>
      </failoverdomain>
      <failoverdomain name="FD_PREF_BOF_227" ordered="1" restricted="1">
        <failoverdomainnode name="bof227" priority="1"/>
        <failoverdomainnode name="bof226" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources/>
  </rm>
</cluster>
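To make it easier to see which fence agents each node's method calls, and in what order, here is a minimal sketch in Python (standard library only). The function name fence_agents_by_node and the trimmed SAMPLE string are illustrative assumptions, not part of any cluster tool. It only resolves the device names in cluster.conf to their agents. For bof226 it shows fence_manual listed ahead of fence_gnbd in the same method, which matches the fence_manual message in the log above waiting for fence_ack_manual.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf above (only bof226 and its devices),
# used here purely as sample input.
SAMPLE = """<cluster config_version="84" name="MZ_CLUSTER">
  <clusternodes>
    <clusternode name="bof226" votes="1">
      <fence>
        <method name="1">
          <device name="device_MF_226" nodename="bof226"/>
          <device name="gnbd_server_bof227" nodename="bof226"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_manual" name="device_MF_226"/>
    <fencedevice agent="fence_gnbd" name="gnbd_server_bof227" servers="bof227"/>
  </fencedevices>
</cluster>"""


def fence_agents_by_node(conf_xml):
    """Hypothetical helper: map node name -> list of methods,
    each method being the ordered list of agent names it references."""
    root = ET.fromstring(conf_xml)
    # Device name -> agent, taken from the <fencedevices> section.
    agents = {d.get("name"): d.get("agent") for d in root.iter("fencedevice")}
    result = {}
    for node in root.iter("clusternode"):
        methods = []
        for method in node.findall("./fence/method"):
            # Keep the devices in document order; unknown names are flagged.
            methods.append(
                [agents.get(dev.get("name"), "unknown")
                 for dev in method.findall("device")]
            )
        result[node.get("name")] = methods
    return result


if __name__ == "__main__":
    for name, methods in fence_agents_by_node(SAMPLE).items():
        print(name, methods)
```

To run it against the real file, read cluster.conf from disk and parse it with ET.parse(path).getroot() instead of ET.fromstring, since the full file starts with an XML declaration.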
-----Original Message-----
From: David Teigland [mailto:teigland@xxxxxxxxxx]
Sent: Monday, August 28, 2006 2:36 PM
To: Zelikov, Mikhail
Cc: linux-cluster@xxxxxxxxxx
Subject: Re: DLM locks with 1 node on 2 node cluster

It's trying to fence the failed node and won't continue with recovery until
that's done. What fencing method are you using in cluster.conf? Are there
any fencing error messages in /var/log/messages? What does your cluster.conf
look like?

Dave

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster