rolling upgrade of corosync/pacemaker cluster

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

We are planning to upgrade a pacemaker cluster on Ubuntu natty (11.04) to
oneiric (11.10) which involves moving from corosync version 1.2.1 to 1.3.0,
and pacemaker from version 1.0.10 to 1.1.5. We decided to test the process by
creating a simple cluster running natty and introduce a new node running
oneiric. When we did this the corosync services on all the natty nodes crashed.


Is this something that we should be able to do? ie should nodes running version
1.2.1 of corosync interoperate with nodes running version 1.3.0 in the same
corosync cluster?

Any suggestions for how to do a roling upgrade of a pacemaker/corosync cluster?
If this is not the correct place for this question, is there a better forum?

Thanks
Ollie

Syslogs from the nodes is below.

Before starting corosync on oneiric we have three natty nodes (tstnode4,6,7) in a cluster with no resources configured. Corosync version 1.3.0 is installed on
tstnode5, but the service has not been started yet:

root@tstnode4:~# crm status
============
Last updated: Tue Jan 15 11:35:12 2013
Stack: openais
Current DC: tstnode4 - partition with quorum
Version: 1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3
3 Nodes configured, 3 expected votes
0 Resources configured.
============

Online: [ tstnode4 tstnode6 tstnode7 ]

root@tstnode4:~#

We started corosync no tstnode5 at about 13:19 , here is syslog on tstnode4,
showing the corosync service stopping:

Jan 15 13:17:01 tstnode4 CRON[16754]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 208: memb=3, new=0, lost=0 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: memb: tstnode4 1756954634 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: memb: tstnode6 1790509066 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: memb: tstnode7 1807286282 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 208: memb=4, new=1, lost=0 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: update_member: Creating entry for node 1773731850 born on 208 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: update_member: Node 1773731850/unknown is now: member Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 1773731850 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: MEMB: tstnode4 1756954634 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 1773731850 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: MEMB: tstnode6 1790509066 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: pcmk_peer_update: MEMB: tstnode7 1807286282 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: send_member_notification: Sending membership update 208 to 2 children Jan 15 13:19:26 tstnode4 corosync[16656]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_dispatch: Membership 208: quorum retained Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_new_peer: Node <null> now has id: 1773731850 Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_update_peer: Node (null): id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) votes=0 born=0 seen=208 proc=00000000000000000000000000000000 Jan 15 13:19:26 tstnode4 cib: [16665]: info: ais_dispatch: Membership 208: quorum retained Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_new_peer: Node <null> now has id: 1773731850 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: update_member: 0x20388e0 Node 1773731850 (tstnode5) born on: 208 Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_update_peer: Node (null): id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) votes=0 born=0 seen=208 proc=00000000000000000000000000000000 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: update_member: 0x20388e0 Node 1773731850 now known as tstnode5 (was: (null)) Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: update_member: Node tstnode5 now has process list: 00000000000000000000000000111312 (1118994) Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: update_member: Node tstnode5 now has 1 quorum votes (was 0) Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: update_expected_votes: Expected quorum votes 3 -> 4 Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: send_member_notification: Sending membership update 208 to 2 children Jan 15 13:19:26 tstnode4 cib: [16665]: info: ais_dispatch: Membership 208: quorum retained Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_get_peer: Node 1773731850 is now known as tstnode5 Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_update_peer: Node tstnode5: id=1773731850 state=member addr=r(0) ip(10.0.185.105) votes=1 (new) born=208 seen=208 proc=00000000000000000000000000111312 (new) Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/37, version=0.15.4): ok (rc=0) Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_ais_dispatch: Setting expected votes to 3 Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_dispatch: Membership 208: quorum retained Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_get_peer: Node 1773731850 is now known as tstnode5 Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_status_callback: status: tstnode5 is now member Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_update_peer: Node tstnode5: id=1773731850 state=member addr=r(0) ip(10.0.185.105) votes=1 (new) born=208 seen=208 proc=00000000000000000000000000111312 (new) Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/40, version=0.15.4): ok (rc=0) Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: - <cib admin_epoch="0" epoch="15" num_updates="4" /> Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: + <cib admin_epoch="0" epoch="16" num_updates="1" > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: + <configuration > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: + <nodes > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: + <node id="tstnode5" uname="tstnode5" type="normal" __crm_diff_marker__="added:top" /> Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: + </nodes> Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: + </configuration> Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: + </cib> Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/41, version=0.16.1): ok (rc=0) Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11) Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: ais_dispatch: AIS connection failed Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11) Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11)
Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: AIS connection terminated
Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: AIS connection failed Jan 15 13:19:27 tstnode4 attrd: [16667]: CRIT: attrd_ais_destroy: Lost connection to OpenAIS service!
Jan 15 13:19:27 tstnode4 attrd: [16667]: info: main: Exiting...
Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: attrd_cib_connection_destroy: Connection to the CIB terminated... Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: send_ais_text: Sending message 24: FAILED (rc=2): Library error: Connection timed out (110) Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: send_ais_text: Sending message 20: FAILED (rc=2): Library error: Connection timed out (110) Jan 15 13:19:28 tstnode4 crmd: [16615]: ERROR: send_ais_text: Sending message 0: FAILED (rc=2): Library error: Connection timed out (110) Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: send_ais_message: Not connected to AIS Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11) Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: ais_dispatch: AIS connection failed Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: cib_ais_destroy: AIS connection terminated Jan 15 13:19:28 tstnode4 crmd: [16669]: info: crm_ais_dispatch: Setting expected votes to 4 Jan 15 13:19:28 tstnode4 crmd: [16669]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change Jan 15 13:19:28 tstnode4 crmd: [16669]: info: need_abort: Aborting on change to admin_epoch Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11) Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: ais_dispatch: AIS connection failed Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: crm_ais_destroy: AIS connection terminated Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: get_ais_nodeid: Sending nodeid request: FAILED (rc=2): Library error Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: init_ais_connection_once: Node name mismatch! OpenAIS supplied (null), our lookup returned tstnode4 Jan 15 13:19:29 tstnode4 crmd: [16615]: notice: init_ais_connection_once: Node name mismatches usually occur when assigned automatically by DHCP servers Jan 15 13:19:29 tstnode4 crmd: [16615]: notice: init_ais_connection_once: If this node was part of the cluster with a different name, you will need to remove the old entry with crm_node --remove Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: crm_abort: crm_strdup_fn: Triggered assert at utils.c:782 : src != NULL Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_ha_control: Connected to the cluster Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_started: Delaying start, CCM (0000000000100000) not connected Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: send_ipc_message: IPC Channel to 16665 is not connected Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: cib_native_perform_op: Sending message to CIB service FAILED Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: config_query_callback: Local CIB query resulted in an error: send failed Jan 15 13:19:29 tstnode4 crmd: [16615]: info: cib_native_msgready: Lost connection to the CIB service [16665]. Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: cib_native_dispatch: Lost connection to the CIB service [16665/callback]. Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: cib_native_dispatch: Lost connection to the CIB service [16665/command]. Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: crmd_cib_connection_destroy: Connection to the CIB terminated... Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input I_ERROR from config_query_callback() received in state S_STARTING Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_state_transition: State transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=config_query_callback ] Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_recover: Action A_RECOVER (0000000001000000) not supported Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_started: Start cancelled... S_RECOVERY Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input I_ERROR from crmd_cib_connection_destroy() received in state S_RECOVERY Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_dc_release: DC role released Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_te_control: Transitioner is now inactive Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_state_transition: State transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE cause=C_FSA_INTERNAL origin=do_recover ] Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_lrm_control: Disconnected from the LRM Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_ha_control: Disconnected from OpenAIS Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_cib_control: Disconnecting CIB Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_exit: Could not recover from internal error Jan 15 13:19:29 tstnode4 crmd: [16615]: info: free_mem: Dropping I_RELEASE_SUCCESS: [ state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_dc_release ] Jan 15 13:19:29 tstnode4 crmd: [16615]: info: free_mem: Dropping I_TERMINATE: [ state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_stop ]
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_exit: [crmd] stopped (2)



At this time corosync has crashed on the three natty nodes and this is how tstnode5 sees the cluster:

root@tstnode5:~# crm status
============
Last updated: Tue Jan 15 13:11:01 2013
Stack: openais
Current DC: tstnode5 - partition WITHOUT quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
4 Nodes configured, 4 expected votes
0 Resources configured.
============

Online: [ tstnode5 ]
OFFLINE: [ tstnode6 tstnode7 tstnode4 ]

And the cluster configuration is: (we disabled stonith because it is not
installed on this test cluster.)

root@tstnode5:~# crm configure show
node tstnode4
node tstnode5
node tstnode6
node tstnode7
property $id="cib-bootstrap-options" \
    dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="4" \
    stonith-enabled="false"
root@tstnode5:~#


root@tstnode5:~# crm_verify -LV
crm_verify[2534]: 2013/01/15_13:30:33 WARN: cluster_status: We do not have quorum - fencing and resource management disabled
root@tstnode5:~#


Here is syslog on tstnode5 at the corresponding time:
Jan 15 13:17:01 tstnode5 CRON[2464]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Corosync Cluster Engine ('1.3.0'): started and ready to provide service. Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Corosync built-in features: nss Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'. Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] Initializing transport (UDP/IP Multicast). Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] The network interface [10.0.185.105] is now up. Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: process_ais_conf: Reading configure Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: config_find_init: Local handle: 2013064636357672963 for logging Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: config_find_next: Processing additional logging options... Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Found 'off' for option: debug Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Found 'yes' for option: to_logfile Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Found '/var/log/corosync/corosync.log' for option: logfile Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Found 'yes' for option: to_syslog Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: config_find_init: Local handle: 4730966301143465988 for quorum Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: config_find_next: No additional configuration supplied for: quorum Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: No default for option: provider Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: config_find_init: Local handle: 7739444317642555397 for service Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: config_find_next: Processing additional service options... Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Found '0' for option: ver Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: Invoked: /usr/lib/heartbeat/stonithd Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: get_cluster_type: Cluster type is: 'openais'. Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_logd Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: CRM: Initialized Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] Logging: Initialized pcmk_startup Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: Service: 9 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: Local hostname: tstnode5 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_update_nodeid: Local node id: 1773731850 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Creating entry for node 1773731850 born on 0 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x7fd27c0023b0 Node 1773731850 now known as tstnode5 (was: (null)) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode5 now has 1 quorum votes (was 0) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node 1773731850/tstnode5 is now: member Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: Forked child 2492 for process stonith-ng Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: Forked child 2493 for process cib Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: Forked child 2494 for process lrmd Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: Forked child 2495 for process attrd Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: Forked child 2496 for process pengine Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: Forked child 2497 for process crmd Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine loaded: Pacemaker Cluster Manager 1.1.5 Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine loaded: corosync extended virtual synchrony service Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine loaded: corosync configuration service Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine loaded: corosync cluster config database access v1.01 Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine loaded: corosync profile loading service Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine. Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: init_ais_connection_classic: AIS connection established Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 204: memb=0, new=0, lost=0 Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: get_ais_nodeid: Server details: id=1773731850 uname=tstnode5 cname=pcmk Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 204: memb=1, new=1, lost=0 Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: NEW: tstnode5 1773731850 Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_new_peer: Node tstnode5 now has id: 1773731850 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: MEMB: tstnode5 1773731850 Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_new_peer: Node 1773731850 is now known as tstnode5 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode5 now has process list: 00000000000000000000000000111312 (1118994) Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: main: Starting stonith-ng mainloop Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] chosen downlist from node r(0) ip(10.0.185.105) Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Completed service synchronization, ready to provide service. Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: Recorded connection 0x1f7c800 for stonith-ng/2492 Jan 15 13:19:26 tstnode5 attrd: [2495]: info: Invoked: /usr/lib/heartbeat/attrd Jan 15 13:19:26 tstnode5 pengine: [2496]: info: Invoked: /usr/lib/heartbeat/pengine Jan 15 13:19:26 tstnode5 crmd: [2497]: info: Invoked: /usr/lib/heartbeat/crmd Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Starting up
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: get_cluster_type: Cluster type is: 'openais'. Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jan 15 13:19:26 tstnode5 attrd: [2495]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jan 15 13:19:26 tstnode5 cib: [2493]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig) Jan 15 13:19:26 tstnode5 crmd: [2497]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster Jan 15 13:19:26 tstnode5 crmd: [2497]: info: main: CRM Hg Version: 01e86afaaa6d4a8c4836f68df80ababd6ca3902f
Jan 15 13:19:26 tstnode5 crmd: [2497]: info: crmd_init: Starting crmd
Jan 15 13:19:26 tstnode5 cib: [2493]: info: validate_with_relaxng: Creating RNG parser context
Jan 15 13:19:26 tstnode5 lrmd: [2494]: info: enabling coredumps
Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Core dumps could be lost if multiple dumps occur. Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Jan 15 13:19:26 tstnode5 lrmd: [2494]: info: Started.
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: init_ais_connection_classic: AIS connection established Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: Recorded connection 0x1f80b60 for attrd/2495 Jan 15 13:19:26 tstnode5 attrd: [2495]: info: get_ais_nodeid: Server details: id=1773731850 uname=tstnode5 cname=pcmk Jan 15 13:19:26 tstnode5 attrd: [2495]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_new_peer: Node tstnode5 now has id: 1773731850 Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_new_peer: Node 1773731850 is now known as tstnode5 Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Cluster connection active Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Accepting attribute updates
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Starting mainloop...
Jan 15 13:19:26 tstnode5 cib: [2493]: info: startCib: CIB Initialization completed successfully Jan 15 13:19:26 tstnode5 cib: [2493]: info: get_cluster_type: Cluster type is: 'openais'. Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_classic: AIS connection established Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: Recorded connection 0x1f84ec0 for cib/2493 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: Sending membership update 204 to cib Jan 15 13:19:26 tstnode5 cib: [2493]: info: get_ais_nodeid: Server details: id=1773731850 uname=tstnode5 cname=pcmk Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node tstnode5 now has id: 1773731850 Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node 1773731850 is now known as tstnode5
Jan 15 13:19:26 tstnode5 cib: [2493]: info: cib_init: Starting cib mainloop
Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: Membership 204: quorum still lost Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node tstnode5: id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) (new) votes=1 (new) born=0 seen=204 proc=00000000000000000000000000111312 (new) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 208: memb=1, new=0, lost=0 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: memb: tstnode5 1773731850 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 208: memb=4, new=3, lost=0 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Creating entry for node 1756954634 born on 208 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node 1756954634/unknown is now: member Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 1756954634 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Creating entry for node 1790509066 born on 208 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node 1790509066/unknown is now: member Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 1790509066 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Creating entry for node 1807286282 born on 208 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node 1807286282/unknown is now: member Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 1807286282 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 1756954634 Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: Membership 208: quorum still lost Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: MEMB: tstnode5 1773731850 Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null> now has id: 1756954634 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 1790509066 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 1807286282 Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node (null): id=1756954634 state=member (new) addr=r(0) ip(10.0.185.104) votes=0 born=0 seen=208 proc=00000000000000000000000000000000 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: send_member_notification: Sending membership update 208 to 1 children Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null> now has id: 1790509066 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x7fd27c0023b0 Node 1773731850 ((null)) born on: 208 Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node (null): id=1790509066 state=member (new) addr=r(0) ip(10.0.185.106) votes=0 born=0 seen=208 proc=00000000000000000000000000000000 Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null> now has id: 1807286282 Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node (null): id=1807286282 state=member (new) addr=r(0) ip(10.0.185.107) votes=0 born=0 seen=208 proc=00000000000000000000000000000000 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x1f705c0 Node 1790509066 (tstnode6) born on: 196 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x1f705c0 Node 1790509066 now known as tstnode6 (was: (null)) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode6 now has process list: 00000000000000000000000000013312 (78610) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode6 now has 1 quorum votes (was 0) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: send_member_notification: Sending membership update 208 to 1 children Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) Jan 15 13:19:26 tstnode5 cib: [2493]: notice: ais_dispatch_message: Membership 208: quorum acquired Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node 1790509066 is now known as tstnode6 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x1f707f0 Node 1807286282 (tstnode7) born on: 204 Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node tstnode6: id=1790509066 state=member addr=r(0) ip(10.0.185.106) votes=1 (new) born=196 seen=208 proc=00000000000000000000000000013312 (new) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x1f707f0 Node 1807286282 now known as tstnode7 (was: (null)) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode7 now has process list: 00000000000000000000000000013312 (78610) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode7 now has 1 quorum votes (was 0) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_expected_votes: Expected quorum votes 2 -> 3 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: send_member_notification: Sending membership update 208 to 1 children Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x1f703b0 Node 1756954634 (tstnode4) born on: 196 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: 0x1f703b0 Node 1756954634 now known as tstnode4 (was: (null)) Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: Membership 208: quorum retained Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode4 now has process list: 00000000000000000000000000013312 (78610) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node tstnode4 now has 1 quorum votes (was 0) Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node 1807286282 is now known as tstnode7 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_expected_votes: Expected quorum votes 3 -> 4 Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: send_member_notification: Sending membership update 208 to 1 children Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node tstnode7: id=1807286282 state=member addr=r(0) ip(10.0.185.107) votes=1 (new) born=204 seen=208 proc=00000000000000000000000000013312 (new) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: Membership 208: quorum retained Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node 1756954634 is now known as tstnode4 Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node tstnode4: id=1756954634 state=member addr=r(0) ip(10.0.185.104) votes=1 (new) born=196 seen=208 proc=00000000000000000000000000013312 (new) Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from node 1790509066 Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from node 1807286282 Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from node 1756954634 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_cib_control: CIB connection established Jan 15 13:19:27 tstnode5 crmd: [2497]: info: get_cluster_type: Cluster type is: 'openais'. Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jan 15 13:19:27 tstnode5 crmd: [2497]: info: init_ais_connection_classic: AIS connection established Jan 15 13:19:27 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: Recorded connection 0x7fd27c002800 for crmd/2497 Jan 15 13:19:27 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: Sending membership update 208 to crmd Jan 15 13:19:27 tstnode5 crmd: [2497]: info: get_ais_nodeid: Server details: id=1773731850 uname=tstnode5 cname=pcmk Jan 15 13:19:27 tstnode5 crmd: [2497]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode5 now has id: 1773731850 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node 1773731850 is now known as tstnode5 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode5 is now unknown Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_ha_control: Connected to the cluster Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: Delaying start, no membership data (0000000000100000) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crmd_init: Starting crmd's mainloop Jan 15 13:19:27 tstnode5 crmd: [2497]: notice: ais_dispatch_message: Membership 208: quorum acquired Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode4 now has id: 1756954634 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node 1756954634 is now known as tstnode4 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode4 is now unknown Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode4 is now member (was unknown) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node tstnode4: id=1756954634 state=member (new) addr=r(0) ip(10.0.185.104) votes=1 born=196 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:27 tstnode5 crmd: [2497]: notice: crmd_peer_update: Status update: Client tstnode5/crmd now has status [online] (DC=<null>) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode5 is now member (was unknown) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node tstnode5: id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) (new) votes=1 (new) born=208 seen=208 proc=00000000000000000000000000111312 (new) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode6 now has id: 1790509066 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node 1790509066 is now known as tstnode6 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode6 is now unknown Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode6 is now member (was unknown) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node tstnode6: id=1790509066 state=member (new) addr=r(0) ip(10.0.185.106) votes=1 born=196 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode7 now has id: 1807286282 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node 1807286282 is now known as tstnode7 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode7 is now unknown Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode7 is now member (was unknown) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node tstnode7: id=1807286282 state=member (new) addr=r(0) ip(10.0.185.107) votes=1 born=204 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: Delaying start, Config not read (0000000000000040) Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback: Shutdown escalation occurs after: 1200000ms Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback: Checking for expired actions every 900000ms Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback: Sending expected-votes=4 to corosync Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: The local CRM is operational Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ] Jan 15 13:19:27 tstnode5 corosync[2484]: [TOTEM ] A processor failed, forming new configuration. Jan 15 13:19:28 tstnode5 crmd: [2497]: info: ais_dispatch_message: Membership 208: quorum retained Jan 15 13:19:28 tstnode5 crmd: [2497]: info: te_connect_stonith: Attempting connection to fencing daemon... Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 212: memb=1, new=0, lost=3 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: memb: tstnode5 1773731850 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: lost: tstnode4 1756954634 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: lost: tstnode6 1790509066 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: lost: tstnode7 1807286282 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 212: memb=1, new=0, lost=0 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: pcmk_peer_update: MEMB: tstnode5 1773731850 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: ais_mark_unseen_peer_dead: Node tstnode4 was not seen in the previous transition Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node 1756954634/tstnode4 is now: lost Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: ais_mark_unseen_peer_dead: Node tstnode6 was not seen in the previous transition Jan 15 13:19:28 tstnode5 cib: [2493]: notice: ais_dispatch_message: Membership 212: quorum lost Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node 1790509066/tstnode6 is now: lost Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node tstnode4: id=1756954634 state=lost (new) addr=r(0) ip(10.0.185.104) votes=1 born=196 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: ais_mark_unseen_peer_dead: Node tstnode7 was not seen in the previous transition Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member: Node 1807286282/tstnode7 is now: lost Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: send_member_notification: Sending membership update 212 to 2 children Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node tstnode6: id=1790509066 state=lost (new) addr=r(0) ip(10.0.185.106) votes=1 born=196 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:28 tstnode5 corosync[2484]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Jan 15 13:19:28 tstnode5 corosync[2484]: [CPG ] chosen downlist from node r(0) ip(10.0.185.105) Jan 15 13:19:28 tstnode5 corosync[2484]: [MAIN ] Completed service synchronization, ready to provide service. Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node tstnode7: id=1807286282 state=lost (new) addr=r(0) ip(10.0.185.107) votes=1 born=204 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: te_connect_stonith: Connected
Jan 15 13:19:29 tstnode5 crmd: [2497]: notice: ais_dispatch_message: Membership 212: quorum lost Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode4 is now lost (was member) Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node tstnode4: id=1756954634 state=lost (new) addr=r(0) ip(10.0.185.104) votes=1 born=196 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode6 is now lost (was member) Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node tstnode6: id=1790509066 state=lost (new) addr=r(0) ip(10.0.185.106) votes=1 born=196 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback: status: tstnode7 is now lost (was member) Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node tstnode7: id=1807286282 state=lost (new) addr=r(0) ip(10.0.185.107) votes=1 born=204 seen=208 proc=00000000000000000000000000013312 Jan 15 13:19:31 tstnode5 attrd: [2495]: info: cib_connect: Connected to the CIB after 1 signon attempts Jan 15 13:19:31 tstnode5 attrd: [2495]: info: cib_connect: Sending full refresh Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped! (60000ms) Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ] Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ] Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_te_control: Registering TE UUID: fef69ffc-bef2-475d-b3c4-ddaa94ab8af6 Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: cib_client_add_notify_callback: Callback already present Jan 15 13:20:28 tstnode5 crmd: [2497]: info: set_graph_functions: Setting custom graph functions Jan 15 13:20:28 tstnode5 crmd: [2497]: info: unpack_graph: Unpacked transition -1: 0 actions in 0 synapses Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_takeover: Taking over DC status for this partition Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_readwrite: We are now in R/W mode Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_master for section 'all' (origin=local/crmd/5, version=0.12.1): ok (rc=0) Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/6, version=0.12.2): ok (rc=0) Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/8, version=0.12.3): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: join_make_offer: Making join offers based on membership 212 Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_offer_all: join-1: Waiting on 1 outstanding join acks Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/10, version=0.12.4): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback: Shutdown escalation occurs after: 1200000ms Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback: Checking for expired actions every 900000ms Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback: Sending expected-votes=4 to corosync Jan 15 13:20:28 tstnode5 crmd: [2497]: info: ais_dispatch_message: Membership 212: quorum still lost Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crmd_ais_dispatch: Setting expected votes to 4 Jan 15 13:20:28 tstnode5 crmd: [2497]: info: update_dc: Set DC to tstnode5 (3.0.5) Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/13, version=0.12.5): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: ais_dispatch_message: Membership 212: quorum still lost Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crmd_ais_dispatch: Setting expected votes to 4 Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/16, version=0.12.6): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ] Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: All 1 cluster nodes responded to the join offer. Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_finalize: join-1: Syncing the CIB from tstnode5 to the rest of the cluster Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/17, version=0.12.6): ok (rc=0) Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/18, version=0.12.7): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: update_attrd: Connecting to attrd... Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating hash entry for terminate Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating hash entry for shutdown Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='tstnode5']/transient_attributes (origin=local/crmd/19, version=0.12.8): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: erase_xpath_callback: Deletion of "//node_state[@uname='tstnode5']/transient_attributes": ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_ack: join-1: Updating node state to member for tstnode5 Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='tstnode5']/lrm (origin=local/crmd/20, version=0.12.9): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: erase_xpath_callback: Deletion of "//node_state[@uname='tstnode5']/lrm": ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ] Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: All 1 cluster nodes are eligible to run resources. Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_final: Ensuring DC, quorum and node attributes are up-to-date Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crm_update_quorum: Updating quorum status to false (call=24) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: do_te_invoke:173 - Triggered transition abort (complete=1) : Peer Cancelled Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_local_callback: Sending full refresh (origin=crmd) Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update: Sending flush op to all hosts for: terminate (<null>) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 25: Requesting the current CIB: S_POLICY_ENGINE Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/22, version=0.12.11): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match for shutdown action on tstnode4 Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff: Stonith/shutdown of tstnode4 not matched Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: te_update_diff:200 - Triggered transition abort (complete=1, tag=node_state, id=tstnode4, magic=NA, cib=0.12.12) : Node failure Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match for shutdown action on tstnode6 Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff: Stonith/shutdown of tstnode6 not matched Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: te_update_diff:200 - Triggered transition abort (complete=1, tag=node_state, id=tstnode6, magic=NA, cib=0.12.12) : Node failure Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match for shutdown action on tstnode7 Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff: Stonith/shutdown of tstnode7 not matched Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/24, version=0.12.13): ok (rc=0) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: te_update_diff:200 - Triggered transition abort (complete=1, tag=node_state, id=tstnode7, magic=NA, cib=0.12.12) : Node failure Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 26: Requesting the current CIB: S_POLICY_ENGINE Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 27: Requesting the current CIB: S_POLICY_ENGINE Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 28: Requesting the current CIB: S_POLICY_ENGINE Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke_callback: Invoking the PE: query=28, ref=pe_calc-dc-1358256028-8, seq=212, quorate=0 Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update: Sending flush op to all hosts for: shutdown (<null>) Jan 15 13:20:28 tstnode5 pengine: [2496]: WARN: cluster_status: We do not have quorum - fencing and resource management disabled Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Jan 15 13:20:28 tstnode5 crmd: [2497]: info: unpack_graph: Unpacked transition 0: 1 actions in 1 synapses Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_te_invoke: Processing graph 0 (ref=pe_calc-dc-1358256028-8) derived from /var/lib/pengine/pe-input-30.bz2 Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_rsc_command: Initiating action 2: probe_complete probe_complete on tstnode5 (local) - no waiting Jan 15 13:20:28 tstnode5 crmd: [2497]: info: run_graph: ==================================================== Jan 15 13:20:28 tstnode5 crmd: [2497]: notice: run_graph: Transition 0 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-30.bz2): Complete Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_graph_trigger: Transition 0 is now complete Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating hash entry for probe_complete Jan 15 13:20:28 tstnode5 crmd: [2497]: info: notify_crmd: Transition 0 status: done - <null> Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true) Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ] Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: Starting PEngine Recheck Timer Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_perform_update: Sent update 8: probe_complete=true

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss


[Index of Archives]     [Linux Clusters]     [Corosync Project]     [Linux USB Devel]     [Linux Audio Users]     [Photo]     [Yosemite News]    [Yosemite Photos]    [Linux Kernel]     [Linux SCSI]     [X.Org]

  Powered by Linux