Dear All,

I am trying to get an nfs-ganesha HA cluster running with three CentOS Linux release 7.1.1503 nodes. I use the package glusterfs-ganesha-3.7.6-1.el7.x86_64 to get the HA scripts. So far it works fine: when I stop the nfs-ganesha service on one of the nodes, the virtual IP moves to one of the other nodes, and the altai-dead_ip-1 resource is created properly:

root@rnas2 ~# pcs status
Cluster name: ganesha-cluster-dmath
Last updated: Thu Nov 26 10:41:07 2015
Last change: Thu Nov 26 10:40:06 2015 by root via cibadmin on altai
Stack: corosync
Current DC: rnas2 (version 1.1.13-a14efad) - partition with quorum
3 nodes and 13 resources configured

Online: [ altai kaukasus rnas2 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ altai kaukasus rnas2 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ altai kaukasus rnas2 ]
 kaukasus-cluster_ip-1 (ocf::heartbeat:IPaddr): Started kaukasus
 kaukasus-trigger_ip-1 (ocf::heartbeat:Dummy): Started kaukasus
 altai-cluster_ip-1 (ocf::heartbeat:IPaddr): Started kaukasus
 altai-trigger_ip-1 (ocf::heartbeat:Dummy): Started kaukasus
 rnas2-cluster_ip-1 (ocf::heartbeat:IPaddr): Started rnas2
 rnas2-trigger_ip-1 (ocf::heartbeat:Dummy): Started rnas2
 altai-dead_ip-1 (ocf::heartbeat:Dummy): Started altai

PCSD Status:
  kaukasus: Online
  altai: Online
  rnas2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

But when I just disconnect the network on one of the nodes, in this case altai (or power it off),

root@altai ~# ifdown bond0

it takes down the whole cluster. I found the following message in the logs:

Nov 26 10:45:05 rnas2 crmd[17255]: error: Operation nfs-grace_start_0: Timed Out (node=rnas2, call=85, timeout=40000ms)

I wonder whether I have just misconfigured something, or whether this is not supported yet.

Below is the log from rnas2 during the takedown:

Nov 26 10:44:24 rnas2 corosync[8848]: [TOTEM ] A new membership (129.132.145.5:1048) was formed. Members left: 2
Nov 26 10:44:24 rnas2 attrd[17253]: notice: crm_update_peer_proc: Node altai[2] - state is now lost (was member)
Nov 26 10:44:24 rnas2 attrd[17253]: notice: Removing all altai attributes for attrd_peer_change_cb
Nov 26 10:44:25 rnas2 corosync[8848]: [QUORUM] Members[2]: 1 3
Nov 26 10:44:25 rnas2 corosync[8848]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 26 10:44:25 rnas2 cib[17250]: notice: crm_update_peer_proc: Node altai[2] - state is now lost (was member)
Nov 26 10:44:25 rnas2 cib[17250]: notice: Removing altai/2 from the membership list
Nov 26 10:44:25 rnas2 cib[17250]: notice: Purged 1 peers with id=2 and/or uname=altai from the membership cache
Nov 26 10:44:25 rnas2 pacemakerd[17249]: notice: Node altai[2] - state is now lost (was member)
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Node altai[2] - state is now lost (was member)
Nov 26 10:44:25 rnas2 stonith-ng[17251]: notice: crm_update_peer_proc: Node altai[2] - state is now lost (was member)
Nov 26 10:44:25 rnas2 crmd[17255]: warning: No match for shutdown action on 2
Nov 26 10:44:25 rnas2 attrd[17253]: notice: Removing altai/2 from the membership list
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Stonith/shutdown of altai not matched
Nov 26 10:44:25 rnas2 stonith-ng[17251]: notice: Removing altai/2 from the membership list
Nov 26 10:44:25 rnas2 attrd[17253]: notice: Purged 1 peers with id=2 and/or uname=altai from the membership cache
Nov 26 10:44:25 rnas2 crmd[17255]: notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Nov 26 10:44:25 rnas2 stonith-ng[17251]: notice: Purged 1 peers with id=2 and/or uname=altai from the membership cache
Nov 26 10:44:25 rnas2 crmd[17255]: warning: No match for shutdown action on 2
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Stonith/shutdown of altai not matched
Nov 26 10:44:25 rnas2 pengine[17254]: notice: Restart nfs-grace:0 (Started kaukasus)
Nov 26 10:44:25 rnas2 pengine[17254]: notice: Restart nfs-grace:1 (Started rnas2)
Nov 26 10:44:25 rnas2 pengine[17254]: notice: Restart kaukasus-cluster_ip-1 (Started kaukasus)
Nov 26 10:44:25 rnas2 pengine[17254]: notice: Start altai-cluster_ip-1 (kaukasus)
Nov 26 10:44:25 rnas2 pengine[17254]: notice: Start altai-trigger_ip-1 (kaukasus)
Nov 26 10:44:25 rnas2 pengine[17254]: notice: Restart rnas2-cluster_ip-1 (Started rnas2)
Nov 26 10:44:25 rnas2 pengine[17254]: notice: Calculated Transition 85: /var/lib/pacemaker/pengine/pe-input-86.bz2
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 29: stop kaukasus-cluster_ip-1_stop_0 on kaukasus
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 35: start altai-trigger_ip-1_start_0 on kaukasus
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 37: stop rnas2-cluster_ip-1_stop_0 on rnas2 (local)
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 36: monitor altai-trigger_ip-1_monitor_10000 on kaukasus
Nov 26 10:44:25 rnas2 IPaddr(rnas2-cluster_ip-1)[30797]: INFO: IP status = ok, IP_CIP=
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Operation rnas2-cluster_ip-1_stop_0: ok (node=rnas2, call=82, rc=0, cib-update=210, confirmed=true)
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 21: stop nfs-grace_stop_0 on kaukasus
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 23: stop nfs-grace_stop_0 on rnas2 (local)
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Operation nfs-grace_stop_0: ok (node=rnas2, call=84, rc=0, cib-update=211, confirmed=true)
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 22: start nfs-grace_start_0 on kaukasus
Nov 26 10:44:25 rnas2 crmd[17255]: notice: Initiating action 24: start nfs-grace_start_0 on rnas2 (local)
Nov 26 10:44:26 rnas2 ntpd[1700]: Deleting interface #27 bond0, 129.132.145.23#123, interface stats: received=0, sent=0, dropped=0, active_time=69258 secs
Nov 26 10:45:05 rnas2 lrmd[17252]: warning: nfs-grace_start_0 process (PID 30810) timed out
Nov 26 10:45:05 rnas2 lrmd[17252]: warning: nfs-grace_start_0:30810 - timed out after 40000ms
Nov 26 10:45:05 rnas2 crmd[17255]: error: Operation nfs-grace_start_0: Timed Out (node=rnas2, call=85, timeout=40000ms)
Nov 26 10:45:05 rnas2 crmd[17255]: warning: Action 24 (nfs-grace_start_0) on rnas2 failed (target: 0 vs. rc: 1): Error
Nov 26 10:45:05 rnas2 crmd[17255]: notice: Transition aborted by nfs-grace_start_0 'modify' on rnas2: Event failed (magic=2:1;24:85:0:836713e1-c9d3-43f8-bffd-756e023eee8a,...event:381, 0)
Nov 26 10:45:05 rnas2 crmd[17255]: warning: Action 24 (nfs-grace_start_0) on rnas2 failed (target: 0 vs. rc: 1): Error
Nov 26 10:45:05 rnas2 crmd[17255]: warning: Action 22 (nfs-grace_start_0) on kaukasus failed (target: 0 vs. rc: 1): Error
Nov 26 10:45:05 rnas2 crmd[17255]: warning: Action 22 (nfs-grace_start_0) on kaukasus failed (target: 0 vs. rc: 1): Error
Nov 26 10:45:05 rnas2 crmd[17255]: notice: Transition 85 (Complete=13, Pending=0, Fired=0, Skipped=3, Incomplete=8, Source=/var/lib/pacemaker/pengine/pe-input-86.bz2): Stopped
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:0 on kaukasus: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:0 on kaukasus: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:1 on rnas2: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:1 on rnas2: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from rnas2 after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from rnas2 after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from rnas2 after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Recover nfs-grace:0 (Started kaukasus)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Stop nfs-grace:1 (rnas2)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Start kaukasus-cluster_ip-1 (kaukasus)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Start altai-cluster_ip-1 (kaukasus)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Start rnas2-cluster_ip-1 (rnas2)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Calculated Transition 86: /var/lib/pacemaker/pengine/pe-input-87.bz2
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:0 on kaukasus: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:0 on kaukasus: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:1 on rnas2: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Processing failed op start for nfs-grace:1 on rnas2: unknown error (1)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from kaukasus after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from kaukasus after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from kaukasus after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from rnas2 after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from rnas2 after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: warning: Forcing nfs-grace-clone away from rnas2 after 1000000 failures (max=1000000)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Stop nfs-grace:0 (kaukasus)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Stop nfs-grace:1 (rnas2)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Start kaukasus-cluster_ip-1 (kaukasus - blocked)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Start altai-cluster_ip-1 (kaukasus - blocked)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Start rnas2-cluster_ip-1 (rnas2 - blocked)
Nov 26 10:45:05 rnas2 pengine[17254]: notice: Calculated Transition 87: /var/lib/pacemaker/pengine/pe-input-88.bz2
Nov 26 10:45:05 rnas2 crmd[17255]: notice: Initiating action 2: stop nfs-grace_stop_0 on kaukasus
Nov 26 10:45:05 rnas2 crmd[17255]: notice: Initiating action 6: stop nfs-grace_stop_0 on rnas2 (local)
Nov 26 10:45:05 rnas2 crmd[17255]: notice: Operation nfs-grace_stop_0: ok (node=rnas2, call=86, rc=0, cib-update=218, confirmed=true)
Nov 26 10:45:05 rnas2 crmd[17255]: notice: Transition 87 (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-88.bz2): Complete
Nov 26 10:45:05 rnas2 crmd[17255]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]

Yours,
Rigi