Hi all,

Here is my cluster.conf:

<?xml version="1.0"?>
<cluster config_version="6" name="TEST">
  <quorumd device="/dev/vg1quorom/lv1quorom" interval="1" label="quorum" min_score="3" tko="10" votes="3">
    <heuristic interval="2" program="/usr/sbin/qdiskd" score="1"/>
  </quorumd>
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <cman expected_votes="1" two_node="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="RSA_node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="RSA_node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_rsa" ipaddr="RSA_node1" login="USER" name="RSA_node1" passwd="PASSWORD"/>
    <fencedevice agent="fence_rsa" ipaddr="RSA_node2" login="USER" name="RSA_node2" passwd="PASSWORD"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="TEST" ordered="1" restricted="1">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="172.28.104.80" monitor_link="1"/>
      <clusterfs device="/dev/vg1data/lv1data" force_unmount="0" fsid="30516" fstype="gfs2" mountpoint="/data" name="DATA" options=""/>
    </resources>
    <service autostart="1" domain="TEST" exclusive="1" name="TEST">
      <ip ref="172.28.104.80">
        <clusterfs ref="DATA"/>
      </ip>
    </service>
  </rm>
</cluster>

N.B.: node1, node2, RSA_node1 and RSA_node2 are defined in /etc/hosts.

When I move the service from node1 to node2 (by forcing a reboot of node1), it fails, probably because of a network problem. But is there a timeout? If node2 can't connect to node1's RSA, why doesn't it consider node1 "dead", and why doesn't the service move to node2?

Here is the clustat output:

[root@node2 ~]# clustat
Cluster Status for TEST @ Mon Mar 8 11:33:32 2010
Member Status: Quorate

 Member Name     ID   Status
 ------ ----     ---- ------
 node1           1    Offline
 node2           2    Online, Local, rgmanager

 Service Name    Owner (Last)    State
 ------- ----    ----- ------    -----
 service:TEST    node1           stopping

It has been stuck in "stopping" like this for 30 minutes!

Here is the log:

Mar 8 11:35:45 node2 fenced[7038]: agent "fence_rsa" reports: Unable to connect/login to fencing device
Mar 8 11:35:45 node2 fenced[7038]: fence "node1" failed
Mar 8 11:35:50 node2 fenced[7038]: fencing node "node1"
Mar 8 11:35:56 node2 fenced[7038]: agent "fence_rsa" reports: Unable to connect/login to fencing device
Mar 8 11:35:56 node2 fenced[7038]: fence "node1" failed

Why is node2 still trying to fence node1?

Here is something else:

[root@node2 ~]# cman_tool services
type    level   name        id          state
fence   0       default     00010001    FAIL_START_WAIT
[2]
dlm     1       rgmanager   00020001    FAIL_ALL_STOPPED
[1 2]

How can I verify that quorum is being used?

Last question: I have 3 networks (6 NICs, 3 bonds), one of which is dedicated to the heartbeat. Where do I set that in cluster.conf? I would like node1 and node2 to communicate over their own bond3.

Thanks for your help.

mog