Digimer <lists@xxxxxxxxxx> writes:

> Can you paste your full pacemaker config and the logs from the other
> nodes starting just before the lost node went away?

Sorry, I forgot to attach it:
node nebula1
node nebula2
node nebula3
node one
node quorum \
    attributes standby="on"
primitive ONE-Frontend ocf:heartbeat:VirtualDomain \
    params config="/var/lib/one/datastores/one/one.xml" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="100" \
    meta target-role="Stopped"
primitive ONE-OCFS2-datastores ocf:heartbeat:Filesystem \
    params device="/dev/one-fs/datastores" directory="/var/lib/one/datastores" fstype="ocfs2" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="100" \
    op monitor interval="20" timeout="40"
primitive ONE-vg ocf:heartbeat:LVM \
    params volgrpname="one-fs" \
    op start interval="0" timeout="30" \
    op stop interval="0" timeout="30" \
    op monitor interval="60" timeout="30"
primitive Quorum-Node ocf:heartbeat:VirtualDomain \
    params config="/var/lib/libvirt/qemu/pcmk/quorum.xml" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="100" \
    meta target-role="Started"
primitive Stonith-ONE-Frontend stonith:external/libvirt \
    params hostlist="one" hypervisor_uri="qemu:///system" pcmk_host_list="one" pcmk_host_check="static-list" \
    op monitor interval="30m" \
    meta target-role="Started"
primitive Stonith-Quorum-Node stonith:external/libvirt \
    params hostlist="quorum" hypervisor_uri="qemu:///system" pcmk_host_list="quorum" pcmk_host_check="static-list" \
    op monitor interval="30m" \
    meta target-role="Started"
primitive Stonith-nebula1-IPMILAN stonith:external/ipmi \
    params hostname="nebula1-ipmi" ipaddr="A.B.C.D" interface="lanplus" userid="user" passwd="XXXXX" passwd_method="env" priv="operator" pcmk_host_list="nebula1" pcmk_host_check="static-list" priority="10" \
    op monitor interval="30m" \
    meta target-role="Started"
primitive Stonith-nebula2-IPMILAN stonith:external/ipmi \
    params hostname="nebula2-ipmi" ipaddr="A.B.C.D" interface="lanplus" userid="user" passwd="XXXXX" passwd_method="env" priv="operator" pcmk_host_list="nebula2" pcmk_host_check="static-list" priority="20" \
    op monitor interval="30m" \
    meta target-role="Started"
primitive Stonith-nebula3-IPMILAN stonith:external/ipmi \
    params hostname="nebula3-ipmi" ipaddr="A.B.C.D" interface="lanplus" userid="user" passwd="XXXXX" passwd_method="env" priv="operator" pcmk_host_list="nebula3" pcmk_host_check="static-list" priority="30" \
    op monitor interval="30m" \
    meta target-role="Started"
primitive clvm ocf:lvm2:clvm \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op monitor interval="60" timeout="90"
primitive dlm ocf:pacemaker:controld \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="100" \
    op monitor interval="60" timeout="60"
primitive o2cb ocf:pacemaker:o2cb \
    params stack="pcmk" daemon_timeout="30" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="100" \
    op monitor interval="60" timeout="60"
group ONE-Storage dlm o2cb clvm ONE-vg ONE-OCFS2-datastores
clone ONE-Storage-Clone ONE-Storage \
    meta interleave="true" target-role="Started"
location Nebula1-does-not-fence-itslef Stonith-nebula1-IPMILAN \
    rule $id="Nebula1-does-not-fence-itslef-rule" inf: #uname ne nebula1
location Nebula2-does-not-fence-itslef Stonith-nebula2-IPMILAN \
    rule $id="Nebula2-does-not-fence-itslef-rule" inf: #uname ne nebula2
location Nebula3-does-not-fence-itslef Stonith-nebula3-IPMILAN \
    rule $id="Nebula3-does-not-fence-itslef-rule" inf: #uname ne nebula3
location Nodes-with-ONE-Storage ONE-Storage-Clone \
    rule $id="Nodes-with-ONE-Storage-rule" inf: #uname eq nebula1 or #uname eq nebula2 or #uname eq nebula3 or #uname eq one
location ONE-Fontend-fenced-by-hypervisor Stonith-ONE-Frontend \
    rule $id="ONE-Fontend-fenced-by-hypervisor-rule" inf: #uname ne quorum or #uname ne one
location ONE-Frontend-run-on-hypervisor ONE-Frontend \
    rule $id="ONE-Frontend-run-on-hypervisor-rule" 40: #uname eq nebula1 \
    rule $id="ONE-Frontend-run-on-hypervisor-rule-0" 30: #uname eq nebula2 \
    rule $id="ONE-Frontend-run-on-hypervisor-rule-1" 20: #uname eq nebula3
location Quorum-Node-fenced-by-hypervisor Stonith-Quorum-Node \
    rule $id="Quorum-Node-fenced-by-hypervisor-rule" inf: #uname ne quorum or #uname ne one
location Quorum-Node-run-on-nebula3 Quorum-Node inf: nebula3
colocation Fence-ONE-Frontend-Locally inf: Stonith-ONE-Frontend ONE-Frontend
colocation Fence-Quorum-Node-Locally inf: Stonith-Quorum-Node Quorum-Node
colocation Frontend-without-Quorum -inf: ONE-Frontend Quorum-Node
order Fence-before-ONE-Frontend inf: Stonith-ONE-Frontend ONE-Frontend
order Fence-before-Quorum-Node inf: Stonith-Quorum-Node Quorum-Node
order Frontend-after-Storage inf: ONE-Storage-Clone ONE-Frontend
property $id="cib-bootstrap-options" \
    dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="5" \
    stonith-enabled="true" \
    last-lrm-refresh="1412337152" \
    stonith-timeout="30" \
    symmetric-cluster="false"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
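In case it helps, this is roughly how the configuration above was dumped and how it can be sanity-checked on the DC (stock pacemaker/crmsh tools; just a sketch, nothing exotic):

    crm configure show     # dump the live configuration (what is pasted above)
    crm_verify -L -V       # validate the live CIB and print the warnings
    crm_mon -1 -f          # one-shot status snapshot, including fail counts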
Here are the logs from the 3 hypervisors (nebula1, nebula2 and nebula3); note that pacemaker does not start at boot time:
Oct 3 14:47:14 nebula1 pacemakerd: [3899]: notice: update_node_processes: 0xad17f0 Node 1172809920 now known as one, was: Oct 3 14:47:14 nebula1 stonith-ng: [3904]: info: crm_new_peer: Node one now has id: 1172809920 Oct 3 14:47:14 nebula1 stonith-ng: [3904]: info: crm_new_peer: Node 1172809920 is now known as one Oct 3 14:47:14 nebula1 crmd: [3908]: notice: crmd_peer_update: Status update: Client one/crmd now has status [online] (DC=nebula3) Oct 3 14:47:14 nebula1 crmd: [3908]: notice: do_state_transition: State transition S_NOT_DC -> S_PENDING [ input=I_JOIN_OFFER cause=C_HA_MESSAGE origin=route_message ] Oct 3 14:47:14 nebula1 crmd: [3908]: info: update_dc: Set DC to nebula3 (3.0.6) Oct 3 14:47:17 nebula1 crmd: [3908]: notice: do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ] Oct 3 14:47:17 nebula1 attrd: [3906]: notice: attrd_local_callback: Sending full refresh (origin=crmd) Oct 3 14:47:17 nebula1 attrd: [3906]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true) Oct 3 14:47:23 nebula1 kernel: [ 580.785037] dlm: connecting to 1172809920 Oct 3 14:48:20 nebula1 ocfs2_controld: kill node 1172809920 - ocfs2_controld PROCDOWN Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: initiate_remote_stonith_op: Initiating remote operation off for one: e2683312-e06f-44fe-8d65-852a918b7a3c Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: call_remote_stonith: Requesting that nebula1 perform op off one Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: stonith_fence: Found 1 matching devices for 'one' Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: stonith_command: Processed st_fence from nebula1: rc=-1 Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: stonith_fence: Found 1 matching devices for 'one' Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: stonith_command: Processed st_fence from nebula2: rc=-1 Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: stonith_fence: Found 1 matching devices for 'one' Oct 3 14:48:20 nebula1 stonith-ng: [3904]: info: stonith_command: Processed st_fence from nebula3: rc=-1 Oct 3 14:48:21 nebula1 kernel: [ 638.140337] device one-dmz-pub left promiscuous mode Oct 3 14:48:21 nebula1 kernel: [ 638.188351] device one-admin left promiscuous mode Oct 3 14:48:21 nebula1 ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port one-dmz-pub Oct 3 14:48:21 nebula1 ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port one-admin Oct 3 14:48:21 nebula1 external/libvirt[5925]: notice: Domain one was stopped Oct 3 14:48:22 nebula1 stonith-ng: [3904]: notice: log_operation: Operation 'off' [5917] (call 0 from 83164f01-342e-4838-a640-ef55c7905465) for host 'one' with device 
'Stonith-ONE-Frontend' returned: 0 Oct 3 14:48:22 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: Performing: stonith -t external/libvirt -T off one Oct 3 14:48:22 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: success: one 0 Oct 3 14:48:22 nebula1 external/libvirt[5967]: notice: Domain one is already stopped Oct 3 14:48:22 nebula1 ntpd[3399]: Deleting interface #8 one-admin, fe80::fc54:ff:fe6e:bfdc#123, interface stats: received=0, sent=0, dropped=0, active_time=229 secs Oct 3 14:48:22 nebula1 ntpd[3399]: Deleting interface #7 one-dmz-pub, fe80::fc54:ff:fe9e:c8e3#123, interface stats: received=0, sent=0, dropped=0, active_time=229 secs Oct 3 14:48:22 nebula1 ntpd[3399]: peers refreshed Oct 3 14:48:23 nebula1 stonith-ng: [3904]: notice: log_operation: Operation 'off' [5959] (call 0 from a5074eb0-6afa-4060-b3b9-d05e846e0c57) for host 'one' with device 'Stonith-ONE-Frontend' returned: 0 Oct 3 14:48:23 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: Performing: stonith -t external/libvirt -T off one Oct 3 14:48:23 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: success: one 0 Oct 3 14:48:23 nebula1 external/libvirt[5989]: notice: Domain one is already stopped Oct 3 14:48:23 nebula1 corosync[3674]: [TOTEM ] A processor failed, forming new configuration. Oct 3 14:48:24 nebula1 stonith-ng: [3904]: notice: log_operation: Operation 'off' [5981] (call 0 from 1fb319d9-d388-44d4-97a9-212746707e22) for host 'one' with device 'Stonith-ONE-Frontend' returned: 0 Oct 3 14:48:24 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: Performing: stonith -t external/libvirt -T off one Oct 3 14:48:24 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: success: one 0 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 22688: memb=4, new=0, lost=1 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: memb: quorum 1156032704 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: memb: nebula1 1189587136 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: memb: nebula2 1206364352 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: memb: nebula3 1223141568 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: lost: one 1172809920 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 22688: memb=4, new=0, lost=0 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: MEMB: quorum 1156032704 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: MEMB: nebula1 1189587136 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: MEMB: nebula2 1206364352 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: pcmk_peer_update: MEMB: nebula3 1223141568 Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: ais_mark_unseen_peer_dead: Node one was not seen in the previous transition Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: update_member: Node 1172809920/one is now: lost Oct 3 14:48:27 nebula1 corosync[3674]: [pcmk ] info: send_member_notification: Sending membership update 22688 to 4 children Oct 3 14:48:27 nebula1 cluster-dlm: [4127]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula1 cluster-dlm: [4127]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 
seen=22684 proc=00000000000000000000000000000000 Oct 3 14:48:27 nebula1 corosync[3674]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Oct 3 14:48:27 nebula1 kernel: [ 644.579862] dlm: closing connection to node 1172809920 Oct 3 14:48:27 nebula1 cib: [3903]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula1 cib: [3903]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000111312 Oct 3 14:48:27 nebula1 crmd: [3908]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula1 crmd: [3908]: info: ais_status_callback: status: one is now lost (was member) Oct 3 14:48:27 nebula1 crmd: [3908]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000111312 Oct 3 14:48:27 nebula1 stonith-ng: [3904]: notice: remote_op_done: Operation off of one by nebula1 for nebula1[83164f01-342e-4838-a640-ef55c7905465]: OK Oct 3 14:48:27 nebula1 stonith-ng: [3904]: notice: remote_op_done: Operation off of one by nebula1 for nebula2[a5074eb0-6afa-4060-b3b9-d05e846e0c57]: OK Oct 3 14:48:27 nebula1 ocfs2_controld: Could not kick node 1172809920 from the cluster Oct 3 14:48:27 nebula1 ocfs2_controld: [4180]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula1 ocfs2_controld: [4180]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000000000 Oct 3 14:48:27 nebula1 stonith-ng: [3904]: notice: remote_op_done: Operation off of one by nebula1 for nebula3[1fb319d9-d388-44d4-97a9-212746707e22]: OK Oct 3 14:48:27 nebula1 crmd: [3908]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula1: OK (ref=e2683312-e06f-44fe-8d65-852a918b7a3c) Oct 3 14:48:27 nebula1 crmd: [3908]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula2: OK (ref=443f2db0-bb48-4b1f-9179-f64cb587a22c) Oct 3 14:48:27 nebula1 crmd: [3908]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula3: OK (ref=ed21fc6f-2540-491a-8643-2d0258bf2f60) Oct 3 14:48:27 nebula1 corosync[3674]: [CPG ] chosen downlist: sender r(0) ip(192.168.231.68) ; members(old:5 left:1) Oct 3 14:48:27 nebula1 corosync[3674]: [MAIN ] Completed service synchronization, ready to provide service. 
Oct 3 14:48:27 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:27 nebula1 stonith-ng: [3904]: info: can_fence_host_with_device: Stonith-ONE-Frontend can fence one: static-list Oct 3 14:48:27 nebula1 stonith-ng: [3904]: info: stonith_fence: Found 1 matching devices for 'one' Oct 3 14:48:27 nebula1 stonith-ng: [3904]: info: stonith_command: Processed st_fence from nebula3: rc=-1 Oct 3 14:48:27 nebula1 external/libvirt[6011]: notice: Domain one is already stopped Oct 3 14:48:29 nebula1 ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port nebula one-dmz-pub tag=753 -- set Interface one-dmz-pub "external-ids:attached-mac=\"52:54:00:9e:c8:e3\"" -- set Interface one-dmz-pub "external-ids:iface-id=\"049178a7-e96f-4364-be34-2ead6403347e\"" -- set Interface one-dmz-pub "external-ids:vm-id=\"a8069a7b-97fe-4122-85a3-0abbc011f540\"" -- set Interface one-dmz-pub external-ids:iface-status=active Oct 3 14:48:29 nebula1 kernel: [ 646.790675] device one-dmz-pub entered promiscuous mode Oct 3 14:48:29 nebula1 ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port nebula one-admin tag=702 -- set Interface one-admin "external-ids:attached-mac=\"52:54:00:6e:bf:dc\"" -- set Interface one-admin "external-ids:iface-id=\"bbbd86ac-53bc-4f60-8f84-886d9ec20996\"" -- set Interface one-admin "external-ids:vm-id=\"a8069a7b-97fe-4122-85a3-0abbc011f540\"" -- set Interface one-admin external-ids:iface-status=active Oct 3 14:48:29 nebula1 kernel: [ 646.913444] device one-admin entered promiscuous mode Oct 3 14:48:30 nebula1 external/libvirt[6011]: notice: Domain one was started Oct 3 14:48:31 nebula1 stonith-ng: [3904]: notice: log_operation: Operation 'reboot' [6003] (call 0 from 2a9f4455-6b1d-42f7-9330-2a44ff6177f0) for host 'one' with device 'Stonith-ONE-Frontend' returned: 0 Oct 3 14:48:31 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: Performing: stonith -t external/libvirt -T reset one Oct 3 14:48:31 nebula1 stonith-ng: [3904]: info: log_operation: Stonith-ONE-Frontend: success: one 0 Oct 3 14:48:31 nebula1 stonith-ng: [3904]: notice: remote_op_done: Operation reboot of one by nebula1 for nebula3[2a9f4455-6b1d-42f7-9330-2a44ff6177f0]: OK Oct 3 14:48:31 nebula1 crmd: [3908]: notice: tengine_stonith_notify: Peer one was terminated (reboot) by nebula1 for nebula3: OK (ref=23222b1a-8499-4aa0-9964-269cad2a2f9f) Oct 3 14:48:31 nebula1 crmd: [3908]: notice: do_state_transition: State transition S_NOT_DC -> S_PENDING [ input=I_JOIN_OFFER cause=C_HA_MESSAGE origin=route_message ] Oct 3 14:48:31 nebula1 crmd: [3908]: info: update_dc: Set DC to nebula3 (3.0.6) Oct 3 14:48:31 nebula1 attrd: [3906]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
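Side note: the exact command stonith-ng runs is visible above ("stonith -t external/libvirt -T off one"). If it is useful, the same path can be exercised by hand from nebula1 with something like the following (untested sketch; hostlist and hypervisor_uri simply mirror the Stonith-ONE-Frontend params):

    # check that the libvirt connection the agent uses can see the domain at all
    virsh -c qemu:///system list --all

    # drive the external/libvirt plugin directly, as stonith-ng does
    stonith -t external/libvirt \
        hostlist="one" hypervisor_uri="qemu:///system" \
        -T reset one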
Oct 3 14:47:14 nebula2 crmd: [3732]: notice: crmd_peer_update: Status update: Client one/crmd now has status [online] (DC=nebula3) Oct 3 14:47:14 nebula2 crmd: [3732]: notice: do_state_transition: State transition S_NOT_DC -> S_PENDING [ input=I_JOIN_OFFER cause=C_HA_MESSAGE origin=route_message ] Oct 3 14:47:14 nebula2 crmd: [3732]: info: update_dc: Set DC to nebula3 (3.0.6) Oct 3 14:47:17 nebula2 attrd: [3730]: notice: attrd_local_callback: Sending full refresh (origin=crmd) Oct 3 14:47:17 nebula2 attrd: [3730]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true) Oct 3 14:47:17 nebula2 crmd: [3732]: notice: do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ] Oct 3 14:47:23 nebula2 kernel: [ 576.555267] dlm: connecting to 1172809920 Oct 3 14:48:20 nebula2 ocfs2_controld: kill node 1172809920 - ocfs2_controld PROCDOWN Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: initiate_remote_stonith_op: Initiating remote operation off for one: 443f2db0-bb48-4b1f-9179-f64cb587a22c Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula1-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula3-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: stonith_command: Processed st_query from nebula1: rc=0 Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula1-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula3-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: stonith_command: Processed st_query from nebula2: rc=0 Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula1-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula3-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: stonith_command: Processed st_query from nebula3: rc=0 Oct 3 14:48:20 nebula2 stonith-ng: [3728]: info: call_remote_stonith: Requesting that nebula1 perform op off one Oct 3 14:48:23 nebula2 corosync[3287]: [TOTEM ] A processor failed, forming new configuration. 
Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 22688: memb=4, new=0, lost=1 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: memb: quorum 1156032704 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: memb: nebula1 1189587136 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: memb: nebula2 1206364352 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: memb: nebula3 1223141568 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: lost: one 1172809920 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 22688: memb=4, new=0, lost=0 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: MEMB: quorum 1156032704 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: MEMB: nebula1 1189587136 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: MEMB: nebula2 1206364352 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: pcmk_peer_update: MEMB: nebula3 1223141568 Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: ais_mark_unseen_peer_dead: Node one was not seen in the previous transition Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: update_member: Node 1172809920/one is now: lost Oct 3 14:48:27 nebula2 corosync[3287]: [pcmk ] info: send_member_notification: Sending membership update 22688 to 4 children Oct 3 14:48:27 nebula2 cluster-dlm: [4061]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula2 corosync[3287]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Oct 3 14:48:27 nebula2 crmd: [3732]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula2 cluster-dlm: [4061]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000000000 Oct 3 14:48:27 nebula2 crmd: [3732]: info: ais_status_callback: status: one is now lost (was member) Oct 3 14:48:27 nebula2 crmd: [3732]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000111312 Oct 3 14:48:27 nebula2 cib: [3727]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula2 cib: [3727]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000111312 Oct 3 14:48:27 nebula2 kernel: [ 640.350857] dlm: closing connection to node 1172809920 Oct 3 14:48:27 nebula2 stonith-ng: [3728]: notice: remote_op_done: Operation off of one by nebula1 for nebula1[83164f01-342e-4838-a640-ef55c7905465]: OK Oct 3 14:48:27 nebula2 stonith-ng: [3728]: notice: remote_op_done: Operation off of one by nebula1 for nebula2[a5074eb0-6afa-4060-b3b9-d05e846e0c57]: OK Oct 3 14:48:27 nebula2 crmd: [3732]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula1: OK (ref=e2683312-e06f-44fe-8d65-852a918b7a3c) Oct 3 14:48:27 nebula2 stonith-ng: [3728]: notice: remote_op_done: Operation off of one by nebula1 for nebula3[1fb319d9-d388-44d4-97a9-212746707e22]: OK Oct 3 14:48:27 nebula2 crmd: [3732]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula2: OK (ref=443f2db0-bb48-4b1f-9179-f64cb587a22c) Oct 3 14:48:27 nebula2 crmd: [3732]: notice: 
tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula3: OK (ref=ed21fc6f-2540-491a-8643-2d0258bf2f60) Oct 3 14:48:27 nebula2 ocfs2_controld: Could not kick node 1172809920 from the cluster Oct 3 14:48:27 nebula2 ocfs2_controld: [4114]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula2 ocfs2_controld: [4114]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000000000 Oct 3 14:48:27 nebula2 corosync[3287]: [CPG ] chosen downlist: sender r(0) ip(192.168.231.68) ; members(old:5 left:1) Oct 3 14:48:27 nebula2 corosync[3287]: [MAIN ] Completed service synchronization, ready to provide service. Oct 3 14:48:27 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula1-IPMILAN can not fence one: static-list Oct 3 14:48:27 nebula2 stonith-ng: [3728]: info: can_fence_host_with_device: Stonith-nebula3-IPMILAN can not fence one: static-list Oct 3 14:48:27 nebula2 stonith-ng: [3728]: info: stonith_command: Processed st_query from nebula3: rc=0 Oct 3 14:48:31 nebula2 stonith-ng: [3728]: notice: remote_op_done: Operation reboot of one by nebula1 for nebula3[2a9f4455-6b1d-42f7-9330-2a44ff6177f0]: OK Oct 3 14:48:31 nebula2 crmd: [3732]: notice: tengine_stonith_notify: Peer one was terminated (reboot) by nebula1 for nebula3: OK (ref=23222b1a-8499-4aa0-9964-269cad2a2f9f) Oct 3 14:48:31 nebula2 crmd: [3732]: notice: do_state_transition: State transition S_NOT_DC -> S_PENDING [ input=I_JOIN_OFFER cause=C_HA_MESSAGE origin=route_message ] Oct 3 14:48:31 nebula2 crmd: [3732]: info: update_dc: Set DC to nebula3 (3.0.6) Oct 3 14:48:31 nebula2 attrd: [3730]: notice: attrd_local_callback: Sending full refresh (origin=crmd) Oct 3 14:48:31 nebula2 attrd: [3730]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
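Unlike nebula1, nebula2 has no device of its own that can fence "one" (its IPMI devices are static-listed for the nebula hosts only), so the operation is delegated to nebula1, as the log shows. To double-check which node actually holds the Stonith-ONE-Frontend device, something like this should do (sketch; option spelling may differ between pacemaker versions):

    # where is the fencing resource for the frontend currently running?
    crm_resource --resource Stonith-ONE-Frontend --locate

    # ask stonithd which devices can fence node "one"
    stonith_admin --list one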
Oct 3 14:47:14 nebula3 pacemakerd: [4404]: notice: update_node_processes: 0x1504fc0 Node 1172809920 now known as one, was: Oct 3 14:47:14 nebula3 stonith-ng: [4409]: info: crm_new_peer: Node one now has id: 1172809920 Oct 3 14:47:14 nebula3 stonith-ng: [4409]: info: crm_new_peer: Node 1172809920 is now known as one Oct 3 14:47:14 nebula3 crmd: [4413]: notice: crmd_peer_update: Status update: Client one/crmd now has status [online] (DC=true) Oct 3 14:47:14 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=crmd_peer_update ] Oct 3 14:47:14 nebula3 crmd: [4413]: info: abort_transition_graph: do_te_invoke:169 - Triggered transition abort (complete=1) : Peer Halt Oct 3 14:47:14 nebula3 crmd: [4413]: info: join_make_offer: Making join offers based on membership 22684 Oct 3 14:47:14 nebula3 crmd: [4413]: info: do_dc_join_offer_all: join-7: Waiting on 5 outstanding join acks Oct 3 14:47:14 nebula3 crmd: [4413]: info: update_dc: Set DC to nebula3 (3.0.6) Oct 3 14:47:16 nebula3 crmd: [4413]: info: do_dc_join_offer_all: A new node joined the cluster Oct 3 14:47:16 nebula3 crmd: [4413]: info: do_dc_join_offer_all: join-8: Waiting on 5 outstanding join acks Oct 3 14:47:16 nebula3 crmd: [4413]: info: update_dc: Set DC to nebula3 (3.0.6) Oct 3 14:47:17 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ] Oct 3 14:47:17 nebula3 crmd: [4413]: info: do_dc_join_finalize: join-8: Syncing the CIB from nebula3 to the rest of the cluster Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/165, version=0.1226.2): ok (rc=0) Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/166, version=0.1226.3): ok (rc=0) Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/167, version=0.1226.4): ok (rc=0) Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/168, version=0.1226.5): ok (rc=0) Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/169, version=0.1226.6): ok (rc=0) Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/170, version=0.1226.7): ok (rc=0) Oct 3 14:47:17 nebula3 crmd: [4413]: info: do_dc_join_ack: join-8: Updating node state to member for one Oct 3 14:47:17 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='one']/lrm Oct 3 14:47:17 nebula3 crmd: [4413]: info: do_dc_join_ack: join-8: Updating node state to member for quorum Oct 3 14:47:17 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='quorum']/lrm Oct 3 14:47:17 nebula3 crmd: [4413]: info: do_dc_join_ack: join-8: Updating node state to member for nebula2 Oct 3 14:47:17 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='nebula2']/lrm Oct 3 14:47:17 nebula3 crmd: [4413]: info: do_dc_join_ack: join-8: Updating node state to member for nebula1 Oct 3 14:47:17 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='nebula1']/lrm Oct 3 14:47:17 nebula3 crmd: 
[4413]: info: do_dc_join_ack: join-8: Updating node state to member for nebula3 Oct 3 14:47:17 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='nebula3']/lrm Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='one']/lrm (origin=local/crmd/171, version=0.1226.16): ok (rc=0) Oct 3 14:47:17 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ] Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: do_te_invoke:162 - Triggered transition abort (complete=1) : Peer Cancelled Oct 3 14:47:17 nebula3 attrd: [4411]: notice: attrd_local_callback: Sending full refresh (origin=crmd) Oct 3 14:47:17 nebula3 attrd: [4411]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true) Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='quorum']/lrm (origin=local/crmd/173, version=0.1226.18): ok (rc=0) Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=Stonith-nebula3-IPMILAN_last_0, magic=0:7;26:7:7:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.18) : Resource op removal Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:276 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.1226.19) : LRM Refresh Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='nebula2']/lrm (origin=local/crmd/175, version=0.1226.20): ok (rc=0) Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=Stonith-nebula3-IPMILAN_last_0, magic=0:0;29:0:0:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.20) : Resource op removal Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:276 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.1226.21) : LRM Refresh Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='nebula1']/lrm (origin=local/crmd/177, version=0.1226.22): ok (rc=0) Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=Stonith-nebula3-IPMILAN_last_0, magic=0:7;9:1:7:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.22) : Resource op removal Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:276 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.1226.23) : LRM Refresh Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='nebula3']/lrm (origin=local/crmd/179, version=0.1226.24): ok (rc=0) Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=Stonith-nebula3-IPMILAN_last_0, magic=0:7;4:0:7:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.24) : Resource op removal Oct 3 14:47:17 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:276 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.1226.25) : 
LRM Refresh Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/181, version=0.1226.26): ok (rc=0) Oct 3 14:47:17 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/183, version=0.1226.28): ok (rc=0) Oct 3 14:47:17 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:17 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Storage-Clone from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:17 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:17 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula3-IPMILAN resources. Chose nebula2. Oct 3 14:47:17 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula2-IPMILAN resources. Chose nebula3. Oct 3 14:47:17 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula1-IPMILAN resources. Chose nebula2. Oct 3 14:47:17 nebula3 pengine: [4412]: notice: LogActions: Start dlm:3#011(one) Oct 3 14:47:17 nebula3 pengine: [4412]: notice: LogActions: Start o2cb:3#011(one) Oct 3 14:47:17 nebula3 pengine: [4412]: notice: LogActions: Start clvm:3#011(one) Oct 3 14:47:17 nebula3 pengine: [4412]: notice: LogActions: Start ONE-vg:3#011(one) Oct 3 14:47:17 nebula3 pengine: [4412]: notice: LogActions: Start ONE-OCFS2-datastores:3#011(one) Oct 3 14:47:17 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Oct 3 14:47:17 nebula3 crmd: [4413]: info: do_te_invoke: Processing graph 11 (ref=pe_calc-dc-1412340437-173) derived from /var/lib/pengine/pe-input-1429.bz2 Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 28: monitor Stonith-nebula3-IPMILAN_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 29: monitor Stonith-nebula2-IPMILAN_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 30: monitor Stonith-nebula1-IPMILAN_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 31: monitor dlm:3_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 32: monitor o2cb:3_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 33: monitor clvm:3_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 34: monitor ONE-vg:3_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 35: monitor ONE-OCFS2-datastores:3_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 36: monitor ONE-Frontend_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 37: monitor Quorum-Node_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 38: monitor Stonith-ONE-Frontend_monitor_0 on one Oct 3 14:47:17 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 39: monitor 
Stonith-Quorum-Node_monitor_0 on one Oct 3 14:47:17 nebula3 pengine: [4412]: notice: process_pe_message: Transition 11: PEngine Input stored in: /var/lib/pengine/pe-input-1429.bz2 Oct 3 14:47:18 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='one']/transient_attributes (origin=one/crmd/6, version=0.1226.37): ok (rc=0) Oct 3 14:47:18 nebula3 crmd: [4413]: WARN: status_from_rc: Action 36 (ONE-Frontend_monitor_0) on one failed (target: 7 vs. rc: 5): Error Oct 3 14:47:18 nebula3 crmd: [4413]: info: abort_transition_graph: match_graph_event:277 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=ONE-Frontend_last_failure_0, magic=0:5;36:11:7:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.47) : Event failed Oct 3 14:47:18 nebula3 crmd: [4413]: WARN: status_from_rc: Action 37 (Quorum-Node_monitor_0) on one failed (target: 7 vs. rc: 5): Error Oct 3 14:47:18 nebula3 crmd: [4413]: info: abort_transition_graph: match_graph_event:277 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=Quorum-Node_last_failure_0, magic=0:5;37:11:7:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.48) : Event failed Oct 3 14:47:19 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 27: probe_complete probe_complete on one - no waiting Oct 3 14:47:19 nebula3 crmd: [4413]: notice: run_graph: ==== Transition 11 (Complete=15, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pengine/pe-input-1429.bz2): Stopped Oct 3 14:47:19 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ] Oct 3 14:47:19 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Storage-Clone from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:19 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula3-IPMILAN resources. Chose nebula2. Oct 3 14:47:19 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula2-IPMILAN resources. Chose nebula3. Oct 3 14:47:19 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula1-IPMILAN resources. Chose nebula2. 
Oct 3 14:47:19 nebula3 pengine: [4412]: notice: LogActions: Start dlm:3#011(one) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: LogActions: Start o2cb:3#011(one) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: LogActions: Start clvm:3#011(one) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: LogActions: Start ONE-vg:3#011(one) Oct 3 14:47:19 nebula3 pengine: [4412]: notice: LogActions: Start ONE-OCFS2-datastores:3#011(one) Oct 3 14:47:19 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Oct 3 14:47:19 nebula3 crmd: [4413]: info: do_te_invoke: Processing graph 12 (ref=pe_calc-dc-1412340439-187) derived from /var/lib/pengine/pe-input-1430.bz2 Oct 3 14:47:19 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 27: probe_complete probe_complete on one - no waiting Oct 3 14:47:19 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 76: start dlm:3_start_0 on one Oct 3 14:47:19 nebula3 pengine: [4412]: notice: process_pe_message: Transition 12: PEngine Input stored in: /var/lib/pengine/pe-input-1430.bz2 Oct 3 14:47:20 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 77: monitor dlm:3_monitor_60000 on one Oct 3 14:47:20 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 78: start o2cb:3_start_0 on one Oct 3 14:47:22 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 79: monitor o2cb:3_monitor_60000 on one Oct 3 14:47:22 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 80: start clvm:3_start_0 on one Oct 3 14:47:23 nebula3 kernel: [ 585.328498] dlm: connecting to 1172809920 Oct 3 14:47:25 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 81: monitor clvm:3_monitor_60000 on one Oct 3 14:47:25 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 82: start ONE-vg:3_start_0 on one Oct 3 14:47:26 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 83: monitor ONE-vg:3_monitor_60000 on one Oct 3 14:47:26 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 84: start ONE-OCFS2-datastores:3_start_0 on one Oct 3 14:47:26 nebula3 crmd: [4413]: WARN: status_from_rc: Action 84 (ONE-OCFS2-datastores:3_start_0) on one failed (target: 0 vs. 
rc: 1): Error Oct 3 14:47:26 nebula3 crmd: [4413]: WARN: update_failcount: Updating failcount for ONE-OCFS2-datastores:3 on one after failed start: rc=1 (update=INFINITY, time=1412340446) Oct 3 14:47:26 nebula3 crmd: [4413]: info: abort_transition_graph: match_graph_event:277 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=ONE-OCFS2-datastores:3_last_failure_0, magic=0:1;84:12:0:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.71) : Event failed Oct 3 14:47:26 nebula3 crmd: [4413]: notice: run_graph: ==== Transition 12 (Complete=12, Pending=0, Fired=0, Skipped=2, Incomplete=1, Source=/var/lib/pengine/pe-input-1430.bz2): Stopped Oct 3 14:47:26 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ] Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Storage-Clone from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: unpack_rsc_op: Processing failed op ONE-OCFS2-datastores:3_last_failure_0 on one: unknown error (1) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula3-IPMILAN resources. Chose nebula2. Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula2-IPMILAN resources. Chose nebula3. Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula1-IPMILAN resources. Chose nebula2. 
Oct 3 14:47:26 nebula3 pengine: [4412]: notice: LogActions: Recover ONE-OCFS2-datastores:3#011(Started one) Oct 3 14:47:26 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Oct 3 14:47:26 nebula3 crmd: [4413]: info: do_te_invoke: Processing graph 13 (ref=pe_calc-dc-1412340446-198) derived from /var/lib/pengine/pe-input-1431.bz2 Oct 3 14:47:26 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 25: stop ONE-OCFS2-datastores:3_stop_0 on one Oct 3 14:47:26 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair, id=status-one-fail-count-ONE-OCFS2-datastores.3, name=fail-count-ONE-OCFS2-datastores:3, value=INFINITY, magic=NA, cib=0.1226.72) : Transient attribute: update Oct 3 14:47:26 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair, id=status-one-last-failure-ONE-OCFS2-datastores.3, name=last-failure-ONE-OCFS2-datastores:3, value=1412340446, magic=NA, cib=0.1226.73) : Transient attribute: update Oct 3 14:47:26 nebula3 pengine: [4412]: notice: process_pe_message: Transition 13: PEngine Input stored in: /var/lib/pengine/pe-input-1431.bz2 Oct 3 14:47:26 nebula3 crmd: [4413]: notice: run_graph: ==== Transition 13 (Complete=3, Pending=0, Fired=0, Skipped=7, Incomplete=2, Source=/var/lib/pengine/pe-input-1431.bz2): Stopped Oct 3 14:47:26 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ] Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Storage-Clone from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: unpack_rsc_op: Processing failed op ONE-OCFS2-datastores:3_last_failure_0 on one: unknown error (1) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:47:26 
nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula3-IPMILAN resources. Chose nebula2. Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula2-IPMILAN resources. Chose nebula3. Oct 3 14:47:26 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula1-IPMILAN resources. Chose nebula2. Oct 3 14:47:26 nebula3 pengine: [4412]: notice: LogActions: Stop dlm:3#011(one) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: LogActions: Stop o2cb:3#011(one) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: LogActions: Stop clvm:3#011(one) Oct 3 14:47:26 nebula3 pengine: [4412]: notice: LogActions: Stop ONE-vg:3#011(one) Oct 3 14:47:26 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Oct 3 14:47:26 nebula3 crmd: [4413]: info: do_te_invoke: Processing graph 14 (ref=pe_calc-dc-1412340446-200) derived from /var/lib/pengine/pe-input-1432.bz2 Oct 3 14:47:26 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 83: stop ONE-vg:3_stop_0 on one Oct 3 14:47:26 nebula3 pengine: [4412]: notice: process_pe_message: Transition 14: PEngine Input stored in: /var/lib/pengine/pe-input-1432.bz2 Oct 3 14:47:27 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 82: stop clvm:3_stop_0 on one Oct 3 14:47:28 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 81: stop o2cb:3_stop_0 on one Oct 3 14:47:28 nebula3 crmd: [4413]: info: te_rsc_command: Initiating action 80: stop dlm:3_stop_0 on one Oct 3 14:47:29 nebula3 crmd: [4413]: notice: run_graph: ==== Transition 14 (Complete=9, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-1432.bz2): Complete Oct 3 14:47:29 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ] Oct 3 14:48:20 nebula3 ocfs2_controld: kill node 1172809920 - ocfs2_controld PROCDOWN Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: initiate_remote_stonith_op: Initiating remote operation off for one: ed21fc6f-2540-491a-8643-2d0258bf2f60 Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-nebula2-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-Quorum-Node can not fence one: static-list Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: stonith_command: Processed st_query from nebula1: rc=0 Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-nebula2-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-Quorum-Node can not fence one: static-list Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: stonith_command: Processed st_query from nebula2: rc=0 Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-nebula2-IPMILAN can not fence one: static-list Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-Quorum-Node can not fence one: static-list Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: stonith_command: Processed st_query from nebula3: rc=0 Oct 3 14:48:20 nebula3 stonith-ng: [4409]: info: call_remote_stonith: Requesting that nebula1 perform op off one Oct 3 14:48:23 nebula3 
corosync[3938]: [TOTEM ] A processor failed, forming new configuration. Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 22688: memb=4, new=0, lost=1 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: memb: quorum 1156032704 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: memb: nebula1 1189587136 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: memb: nebula2 1206364352 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: memb: nebula3 1223141568 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: lost: one 1172809920 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 22688: memb=4, new=0, lost=0 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: MEMB: quorum 1156032704 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: MEMB: nebula1 1189587136 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: MEMB: nebula2 1206364352 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: pcmk_peer_update: MEMB: nebula3 1223141568 Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: ais_mark_unseen_peer_dead: Node one was not seen in the previous transition Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: update_member: Node 1172809920/one is now: lost Oct 3 14:48:27 nebula3 corosync[3938]: [pcmk ] info: send_member_notification: Sending membership update 22688 to 4 children Oct 3 14:48:27 nebula3 corosync[3938]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Oct 3 14:48:27 nebula3 cluster-dlm: [4867]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula3 crmd: [4413]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula3 cib: [4408]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula3 cluster-dlm: [4867]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000000000 Oct 3 14:48:27 nebula3 crmd: [4413]: info: ais_status_callback: status: one is now lost (was member) Oct 3 14:48:27 nebula3 crmd: [4413]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000111312 Oct 3 14:48:27 nebula3 cib: [4408]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000111312 Oct 3 14:48:27 nebula3 stonith-ng: [4409]: notice: remote_op_done: Operation off of one by nebula1 for nebula1[83164f01-342e-4838-a640-ef55c7905465]: OK Oct 3 14:48:27 nebula3 stonith-ng: [4409]: notice: remote_op_done: Operation off of one by nebula1 for nebula2[a5074eb0-6afa-4060-b3b9-d05e846e0c57]: OK Oct 3 14:48:27 nebula3 stonith-ng: [4409]: notice: remote_op_done: Operation off of one by nebula1 for nebula3[1fb319d9-d388-44d4-97a9-212746707e22]: OK Oct 3 14:48:27 nebula3 ocfs2_controld: Could not kick node 1172809920 from the cluster Oct 3 14:48:27 nebula3 ocfs2_controld: [4920]: info: ais_dispatch_message: Membership 22688: quorum retained Oct 3 14:48:27 nebula3 ocfs2_controld: [4920]: info: crm_update_peer: Node one: id=1172809920 state=lost (new) addr=r(0) ip(192.168.231.69) votes=1 born=22684 seen=22684 proc=00000000000000000000000000000000 Oct 3 
14:48:27 nebula3 kernel: [ 649.172884] dlm: closing connection to node 1172809920 Oct 3 14:48:27 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/196, version=0.1226.79): ok (rc=0) Oct 3 14:48:27 nebula3 corosync[3938]: [CPG ] chosen downlist: sender r(0) ip(192.168.231.68) ; members(old:5 left:1) Oct 3 14:48:27 nebula3 crmd: [4413]: info: crmd_ais_dispatch: Setting expected votes to 5 Oct 3 14:48:27 nebula3 crmd: [4413]: WARN: match_down_event: No match for shutdown action on one Oct 3 14:48:27 nebula3 crmd: [4413]: info: te_update_diff: Stonith/shutdown of one not matched Oct 3 14:48:27 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:234 - Triggered transition abort (complete=1, tag=node_state, id=one, magic=NA, cib=0.1226.80) : Node failure Oct 3 14:48:27 nebula3 crmd: [4413]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula1: OK (ref=e2683312-e06f-44fe-8d65-852a918b7a3c) Oct 3 14:48:27 nebula3 crmd: [4413]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula2: OK (ref=443f2db0-bb48-4b1f-9179-f64cb587a22c) Oct 3 14:48:27 nebula3 crmd: [4413]: notice: tengine_stonith_notify: Peer one was terminated (off) by nebula1 for nebula3: OK (ref=ed21fc6f-2540-491a-8643-2d0258bf2f60) Oct 3 14:48:27 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ] Oct 3 14:48:27 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/199, version=0.1226.81): ok (rc=0) Oct 3 14:48:27 nebula3 corosync[3938]: [MAIN ] Completed service synchronization, ready to provide service. 
Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: pe_fence_node: Node one will be fenced because it is un-expectedly down Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: determine_online_status: Node one is unclean Oct 3 14:48:27 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:48:27 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Storage-Clone from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:48:27 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on quorum: operation monitor failed 'not installed' (rc=5) Oct 3 14:48:27 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:48:27 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on one: operation monitor failed 'not installed' (rc=5) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: unpack_rsc_op: Processing failed op ONE-OCFS2-datastores:3_last_failure_0 on one: unknown error (1) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: common_apply_stickiness: Forcing ONE-Storage-Clone away from one after 1000000 failures (max=1000000) Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula3-IPMILAN resources. Chose nebula2. Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula2-IPMILAN resources. Chose nebula3. Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula1-IPMILAN resources. Chose nebula2. 
Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: stage6: Scheduling Node one for STONITH
Oct 3 14:48:27 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Oct 3 14:48:27 nebula3 crmd: [4413]: info: do_te_invoke: Processing graph 15 (ref=pe_calc-dc-1412340507-206) derived from /var/lib/pengine/pe-warn-125.bz2
Oct 3 14:48:27 nebula3 crmd: [4413]: notice: te_fence_node: Executing reboot fencing operation (101) on one (timeout=30000)
Oct 3 14:48:27 nebula3 stonith-ng: [4409]: info: initiate_remote_stonith_op: Initiating remote operation reboot for one: 23222b1a-8499-4aa0-9964-269cad2a2f9f
Oct 3 14:48:27 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-nebula2-IPMILAN can not fence one: static-list
Oct 3 14:48:27 nebula3 stonith-ng: [4409]: info: can_fence_host_with_device: Stonith-Quorum-Node can not fence one: static-list
Oct 3 14:48:27 nebula3 stonith-ng: [4409]: info: stonith_command: Processed st_query from nebula3: rc=0
Oct 3 14:48:27 nebula3 stonith-ng: [4409]: info: call_remote_stonith: Requesting that nebula1 perform op reboot one
Oct 3 14:48:27 nebula3 pengine: [4412]: WARN: process_pe_message: Transition 15: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-125.bz2
Oct 3 14:48:27 nebula3 pengine: [4412]: notice: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
Oct 3 14:48:31 nebula3 stonith-ng: [4409]: notice: remote_op_done: Operation reboot of one by nebula1 for nebula3[2a9f4455-6b1d-42f7-9330-2a44ff6177f0]: OK
Oct 3 14:48:31 nebula3 crmd: [4413]: info: tengine_stonith_callback: StonithOp <st-reply st_origin="stonith_construct_async_reply" t="stonith-ng" st_op="reboot" st_remote_op="23222b1a-8499-4aa0-9964-269cad2a2f9f" st_clientid="2a9f4455-6b1d-42f7-9330-2a44ff6177f0" st_target="one" st_device_action="st_fence" st_callid="0" st_callopt="0" st_rc="0" st_output="Performing: stonith -t external/libvirt -T reset one#012success: one 0#012" src="nebula1" seq="10" state="2" />
Oct 3 14:48:31 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='one']/lrm
Oct 3 14:48:31 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='one']/transient_attributes
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: crmd_peer_update: Status update: Client one/crmd now has status [offline] (DC=true)
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: tengine_stonith_notify: Peer one was terminated (reboot) by nebula1 for nebula3: OK (ref=23222b1a-8499-4aa0-9964-269cad2a2f9f)
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=check_join_state ]
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: do_te_invoke:169 - Triggered transition abort (complete=0) : Peer Halt
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: run_graph: ==== Transition 15 (Complete=2, Pending=0, Fired=0, Skipped=2, Incomplete=0, Source=/var/lib/pengine/pe-warn-125.bz2): Stopped
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: do_te_invoke:169 - Triggered transition abort (complete=1) : Peer Halt
Oct 3 14:48:31 nebula3 crmd: [4413]: info: join_make_offer: Making join offers based on membership 22688
Oct 3 14:48:31 nebula3 crmd: [4413]: info: do_dc_join_offer_all: join-9: Waiting on 4 outstanding join acks
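
(The st_output in the stonith reply above shows the exact call nebula1 ran; it can be replayed by hand from the KVM host to test the external/libvirt path, roughly as below. The hostlist and hypervisor_uri values here are assumptions mirroring the fencing device parameters, not taken from the reply itself:)

    # Replay the fence action by hand (sketch; parameter values are assumptions)
    stonith -t external/libvirt hostlist="one" hypervisor_uri="qemu:///system" -T reset one
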
Oct 3 14:48:31 nebula3 crmd: [4413]: info: update_dc: Set DC to nebula3 (3.0.6)
Oct 3 14:48:31 nebula3 crmd: [4413]: info: cib_fencing_updated: Fencing update 201 for one: complete
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='one']/lrm (origin=local/crmd/202, version=0.1226.83): ok (rc=0)
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='one']/transient_attributes (origin=local/crmd/203, version=0.1226.84): ok (rc=0)
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
Oct 3 14:48:31 nebula3 crmd: [4413]: info: do_dc_join_finalize: join-9: Syncing the CIB from nebula3 to the rest of the cluster
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/206, version=0.1226.85): ok (rc=0)
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/207, version=0.1226.86): ok (rc=0)
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/208, version=0.1226.87): ok (rc=0)
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/209, version=0.1226.88): ok (rc=0)
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/210, version=0.1226.89): ok (rc=0)
Oct 3 14:48:31 nebula3 crmd: [4413]: info: do_dc_join_ack: join-9: Updating node state to member for quorum
Oct 3 14:48:31 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='quorum']/lrm
Oct 3 14:48:31 nebula3 crmd: [4413]: info: do_dc_join_ack: join-9: Updating node state to member for nebula2
Oct 3 14:48:31 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='nebula2']/lrm
Oct 3 14:48:31 nebula3 crmd: [4413]: info: do_dc_join_ack: join-9: Updating node state to member for nebula1
Oct 3 14:48:31 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='nebula1']/lrm
Oct 3 14:48:31 nebula3 crmd: [4413]: info: do_dc_join_ack: join-9: Updating node state to member for nebula3
Oct 3 14:48:31 nebula3 crmd: [4413]: info: erase_status_tag: Deleting xpath: //node_state[@uname='nebula3']/lrm
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='quorum']/lrm (origin=local/crmd/211, version=0.1226.102): ok (rc=0)
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: do_te_invoke:162 - Triggered transition abort (complete=1) : Peer Cancelled
Oct 3 14:48:31 nebula3 attrd: [4411]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Oct 3 14:48:31 nebula3 attrd: [4411]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='nebula2']/lrm (origin=local/crmd/213, version=0.1226.104): ok (rc=0)
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=Stonith-nebula3-IPMILAN_last_0, magic=0:0;29:0:0:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.104) : Resource op removal
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:276 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.1226.105) : LRM Refresh
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='nebula1']/lrm (origin=local/crmd/215, version=0.1226.106): ok (rc=0)
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=Stonith-nebula3-IPMILAN_last_0, magic=0:7;9:1:7:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.106) : Resource op removal
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:276 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.1226.107) : LRM Refresh
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='nebula3']/lrm (origin=local/crmd/217, version=0.1226.108): ok (rc=0)
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=Stonith-nebula3-IPMILAN_last_0, magic=0:7;4:0:7:1659a791-f3c9-4c85-a6fa-1f601f55db82, cib=0.1226.108) : Resource op removal
Oct 3 14:48:31 nebula3 crmd: [4413]: info: abort_transition_graph: te_update_diff:276 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.1226.109) : LRM Refresh
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/219, version=0.1226.110): ok (rc=0)
Oct 3 14:48:31 nebula3 cib: [4408]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/221, version=0.1226.112): ok (rc=0)
Oct 3 14:48:31 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing Quorum-Node from re-starting on quorum: operation monitor failed 'not installed' (rc=5)
Oct 3 14:48:31 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Storage-Clone from re-starting on quorum: operation monitor failed 'not installed' (rc=5)
Oct 3 14:48:31 nebula3 pengine: [4412]: notice: unpack_rsc_op: Preventing ONE-Frontend from re-starting on quorum: operation monitor failed 'not installed' (rc=5)
Oct 3 14:48:31 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula3-IPMILAN resources. Chose nebula2.
Oct 3 14:48:31 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula2-IPMILAN resources. Chose nebula3.
Oct 3 14:48:31 nebula3 pengine: [4412]: WARN: native_choose_node: 3 nodes with equal score (INFINITY) for running Stonith-nebula1-IPMILAN resources. Chose nebula2.
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Oct 3 14:48:31 nebula3 crmd: [4413]: info: do_te_invoke: Processing graph 16 (ref=pe_calc-dc-1412340511-217) derived from /var/lib/pengine/pe-input-1433.bz2
Oct 3 14:48:31 nebula3 crmd: [4413]: notice: run_graph: ==== Transition 16 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-1433.bz2): Complete
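
(If it helps, the policy engine inputs named in the logs are still on nebula3, so the decision to fence "one" can be replayed offline, for example:)

    # Re-run the stored transition that scheduled the fencing of "one"
    # (sketch; crm_simulate reads the bzip2-compressed PE file directly)
    crm_simulate -S -x /var/lib/pengine/pe-warn-125.bz2
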
--
Daniel Dehennin
Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF