Hi all,
I just performed a test which failed miserably. I have two nodes, Node-1 and Node-2, and the GFS file system /gfs is mounted on Node-1.
Two HA services are running on Node-1. If I unplug the cables on Node-1, those two services should transfer to Node-2, but Node-2 did not take over the services. However, if I do a proper shutdown/reboot of Node-1, the two services transfer to Node-2 without any problem.
Please help!
clustat from Node-2 before unplugging the cable on Node-1:
[root@Node-2 ~]# clustat
Member Status: Quorate

  Member Name                  ID   Status
  ------ ----                  ---- ------
  Node-1                          1 Online, rgmanager
  Node-2                          2 Online, Local, rgmanager

  Service Name            Owner (Last)            State
  ------- ----            ----- ------            -----
  service:nfs             Node-1                  started
  service:ESS_HA          Node-1                  started
clustat from Node-2 after unplugging the cable on Node-1:
[root@Node-2 ~]# clustat
Member Status: Quorate

  Member Name                  ID   Status
  ------ ----                  ---- ------
  Node-1                          1 Offline
  Node-2                          2 Online, Local, rgmanager

  Service Name            Owner (Last)            State
  ------- ----            ----- ------            -----
  service:nfs             Node-1                  started
  service:ESS_HA          Node-1                  started
/etc/cluster/cluster.conf:
[root@Node-2 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="54" name="idm_cluster">
        <fence_daemon post_fail_delay="0" post_join_delay="120"/>
        <clusternodes>
                <clusternode name="Node-1" nodeid="1" votes="1">
                        <fence/>
                </clusternode>
                <clusternode name="Node-2" nodeid="2" votes="1">
                        <fence/>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices/>
        <rm>
                <failoverdomains>
                        <failoverdomain name="nfs" ordered="0" restricted="1">
                                <failoverdomainnode name="Node-1" priority="1"/>
                                <failoverdomainnode name="Node-2" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <clusterfs device="/dev/vg00/mygfs" force_unmount="0" fsid="59408" fstype="gfs" mountpoint="/gfs" name="gfs" options=""/>
                        <ip address="10.128.107.229" monitor_link="1"/>
                        <script file="/gfs/ess_clus/HA/clusTest.sh" name="ESS_HA_test"/>
                        <script file="/gfs/clusTest.sh" name="Clus_Test"/>
                </resources>
                <service autostart="1" name="nfs">
                        <clusterfs ref="gfs"/>
                        <ip ref="10.128.107.229"/>
                </service>
                <service autostart="1" domain="nfs" name="ESS_HA" recovery="restart">
                        <script ref="ESS_HA_test"/>
                        <clusterfs ref="gfs"/>
                        <ip ref="10.128.107.229"/>
                </service>
        </rm>
</cluster>
[root@Node-2 ~]#
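I notice that my <fencedevices/> section is empty and both nodes have an empty <fence/> block. For reference, here is a minimal sketch of what I understand a populated fencing section should look like (fence_ipmilan, the IP addresses, and the credentials below are just placeholder assumptions, not my real hardware):

        <clusternode name="Node-1" nodeid="1" votes="1">
                <fence>
                        <method name="1">
                                <!-- "node1-ipmi" must match a fencedevice name below -->
                                <device name="node1-ipmi"/>
                        </method>
                </fence>
        </clusternode>
        ...
        <fencedevices>
                <!-- all addresses/credentials here are placeholders -->
                <fencedevice agent="fence_ipmilan" name="node1-ipmi" ipaddr="192.168.0.101" login="admin" passwd="secret"/>
                <fencedevice agent="fence_ipmilan" name="node2-ipmi" ipaddr="192.168.0.102" login="admin" passwd="secret"/>
        </fencedevices>

Is the empty fencing configuration the reason Node-2 never takes over?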
Node-2: tail -f /var/log/messages
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] CLM CONFIGURATION CHANGE
Jun 29 18:20:49 vm-idm02 fenced[1706]: vm-idm01 not a cluster member after 0 sec post_fail_delay
Jun 29 18:20:49 vm-idm02 kernel: dlm: closing connection to node 1
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] New Configuration:
Jun 29 18:20:49 vm-idm02 fenced[1706]: fencing node "vm-idm01"
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] r(0) ip(10.128.107.224)
Jun 29 18:20:49 vm-idm02 fenced[1706]: fence "vm-idm01" failed
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] Members Left:
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] r(0) ip(10.128.107.223)
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] Members Joined:
Jun 29 18:20:49 vm-idm02 openais[1690]: [SYNC ] This node is within the primary component and will provide service.
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] CLM CONFIGURATION CHANGE
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] New Configuration:
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] r(0) ip(10.128.107.224)
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] Members Left:
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] Members Joined:
Jun 29 18:20:49 vm-idm02 openais[1690]: [SYNC ] This node is within the primary component and will provide service.
Jun 29 18:20:49 vm-idm02 openais[1690]: [TOTEM] entering OPERATIONAL state.
Jun 29 18:20:49 vm-idm02 openais[1690]: [CLM  ] got nodejoin message 10.128.107.224
Jun 29 18:20:49 vm-idm02 openais[1690]: [CPG  ] got joinlist message from node 2
Jun 29 18:20:54 vm-idm02 fenced[1706]: fencing node "Node-1"
Jun 29 18:20:54 vm-idm02 fenced[1706]: fence "Node-1" failed
Jun 29 18:20:59 vm-idm02 fenced[1706]: fencing node "Node-1"
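If I read the log right, fenced keeps retrying the fence operation and it never succeeds, and as far as I understand rgmanager will not relocate the services until the dead node has actually been fenced. I believe the fence configuration can be tested by hand with fence_node (shipped with cman), for example:

# Ask fenced to fence Node-1 using whatever is configured in cluster.conf
[root@Node-2 ~]# fence_node Node-1

With my current (empty) fencing configuration I would expect this to fail in the same way as in the log above.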
Regards,
Rahul