Both corosync 1.2.x and 1.3.x have been unsupported by upstream for quite a long time. I would suggest upgrading to the latest 1.4.5 from source code; a rough sketch of what that might look like is appended at the very end of this message, below the quoted logs. Still, I'm unsure whether there is a clear upgrade path. Regards, Honza Ollie Leahy wrote: > Hi, > > We are planning to upgrade a pacemaker cluster on Ubuntu natty (11.04) to > oneiric (11.10), which involves moving from corosync version 1.2.1 to 1.3.0, > and pacemaker from version 1.0.10 to 1.1.5. We decided to test the > process by > creating a simple cluster running natty and introducing a new node running > oneiric. When we did this, the corosync services on all the natty nodes > crashed. > > > Is this something that we should be able to do? i.e. should nodes running > version > 1.2.1 of corosync interoperate with nodes running version 1.3.0 in the same > corosync cluster? > > Any suggestions for how to do a rolling upgrade of a pacemaker/corosync > cluster? > If this is not the correct place for this question, is there a better > forum? > > Thanks > Ollie > > Syslogs from the nodes are below. > > Before starting corosync on oneiric we have three natty nodes > (tstnode4,6,7) in > a cluster with no resources configured. Corosync version 1.3.0 is > installed on > tstnode5, but the service has not been started yet: > > root@tstnode4:~# crm status > ============ > Last updated: Tue Jan 15 11:35:12 2013 > Stack: openais > Current DC: tstnode4 - partition with quorum > Version: 1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3 > 3 Nodes configured, 3 expected votes > 0 Resources configured. > ============ > > Online: [ tstnode4 tstnode6 tstnode7 ] > > root@tstnode4:~# > > We started corosync on tstnode5 at about 13:19; here is syslog on > tstnode4, > showing the corosync service stopping: > > Jan 15 13:17:01 tstnode4 CRON[16754]: (root) CMD ( cd / && run-parts > --report /etc/cron.hourly) > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] notice: > pcmk_peer_update: Transitional membership event on ring 208: memb=3, > new=0, lost=0 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: memb: tstnode4 1756954634 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: memb: tstnode6 1790509066 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: memb: tstnode7 1807286282 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] notice: > pcmk_peer_update: Stable membership event on ring 208: memb=4, new=1, > lost=0 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > update_member: Creating entry for node 1773731850 born on 208 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > update_member: Node 1773731850/unknown is now: member > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: NEW: .pending. 1773731850 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: MEMB: tstnode4 1756954634 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: MEMB: .pending. 1773731850 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: MEMB: tstnode6 1790509066 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > pcmk_peer_update: MEMB: tstnode7 1807286282 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > send_member_notification: Sending membership update 208 to 2 children > Jan 15 13:19:26 tstnode4 corosync[16656]: [TOTEM ] A processor joined > or left the membership and a new membership was formed. 
> Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_dispatch: Membership > 208: quorum retained > Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_new_peer: Node <null> > now has id: 1773731850 > Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_update_peer: Node > (null): id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) > votes=0 born=0 seen=208 proc=00000000000000000000000000000000 > Jan 15 13:19:26 tstnode4 cib: [16665]: info: ais_dispatch: Membership > 208: quorum retained > Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_new_peer: Node <null> > now has id: 1773731850 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > update_member: 0x20388e0 Node 1773731850 (tstnode5) born on: 208 > Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_update_peer: Node > (null): id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) > votes=0 born=0 seen=208 proc=00000000000000000000000000000000 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > update_member: 0x20388e0 Node 1773731850 now known as tstnode5 (was: > (null)) > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > update_member: Node tstnode5 now has process list: > 00000000000000000000000000111312 (1118994) > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > update_member: Node tstnode5 now has 1 quorum votes (was 0) > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > update_expected_votes: Expected quorum votes 3 -> 4 > Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info: > send_member_notification: Sending membership update 208 to 2 children > Jan 15 13:19:26 tstnode4 cib: [16665]: info: ais_dispatch: Membership > 208: quorum retained > Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_get_peer: Node > 1773731850 is now known as tstnode5 > Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_update_peer: Node > tstnode5: id=1773731850 state=member addr=r(0) ip(10.0.185.105) votes=1 > (new) born=208 seen=208 proc=00000000000000000000000000111312 (new) > Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request: > Operation complete: op cib_modify for section nodes > (origin=local/crmd/37, version=0.15.4): ok (rc=0) > Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_ais_dispatch: Setting > expected votes to 3 > Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_dispatch: Membership > 208: quorum retained > Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_get_peer: Node > 1773731850 is now known as tstnode5 > Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_status_callback: > status: tstnode5 is now member > Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_update_peer: Node > tstnode5: id=1773731850 state=member addr=r(0) ip(10.0.185.105) votes=1 > (new) born=208 seen=208 proc=00000000000000000000000000111312 (new) > Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request: > Operation complete: op cib_modify for section crm_config > (origin=local/crmd/40, version=0.15.4): ok (rc=0) > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > - <cib admin_epoch="0" epoch="15" num_updates="4" /> > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > + <cib admin_epoch="0" epoch="16" num_updates="1" > > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > + <configuration > > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > + <nodes > > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > + <node id="tstnode5" uname="tstnode5" type="normal" > 
__crm_diff_marker__="added:top" /> > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > + </nodes> > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > + </configuration> > Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff: > + </cib> > Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request: > Operation complete: op cib_modify for section nodes > (origin=local/crmd/41, version=0.16.1): ok (rc=0) > Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: ais_dispatch: > Receiving message body failed: (2) Library error: Resource temporarily > unavailable (11) > Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: ais_dispatch: AIS > connection failed > Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: Receiving > message body failed: (2) Library error: Resource temporarily unavailable > (11) > Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: Receiving > message body failed: (2) Library error: Resource temporarily unavailable > (11) > Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: AIS connection > terminated > Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: AIS > connection failed > Jan 15 13:19:27 tstnode4 attrd: [16667]: CRIT: attrd_ais_destroy: Lost > connection to OpenAIS service! > Jan 15 13:19:27 tstnode4 attrd: [16667]: info: main: Exiting... > Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: > attrd_cib_connection_destroy: Connection to the CIB terminated... > Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: send_ais_text: Sending > message 24: FAILED (rc=2): Library error: Connection timed out (110) > Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: send_ais_text: Sending > message 20: FAILED (rc=2): Library error: Connection timed out (110) > Jan 15 13:19:28 tstnode4 crmd: [16615]: ERROR: send_ais_text: Sending > message 0: FAILED (rc=2): Library error: Connection timed out (110) > Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: send_ais_message: Not > connected to AIS > Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: ais_dispatch: Receiving > message body failed: (2) Library error: Resource temporarily unavailable > (11) > Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: ais_dispatch: AIS > connection failed > Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: cib_ais_destroy: AIS > connection terminated > Jan 15 13:19:28 tstnode4 crmd: [16669]: info: crm_ais_dispatch: Setting > expected votes to 4 > Jan 15 13:19:28 tstnode4 crmd: [16669]: info: abort_transition_graph: > need_abort:59 - Triggered transition abort (complete=1) : Non-status change > Jan 15 13:19:28 tstnode4 crmd: [16669]: info: need_abort: Aborting on > change to admin_epoch > Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: ais_dispatch: Receiving > message body failed: (2) Library error: Resource temporarily unavailable > (11) > Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: ais_dispatch: AIS > connection failed > Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: crm_ais_destroy: AIS > connection terminated > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: get_ais_nodeid: Sending > nodeid request: FAILED (rc=2): Library error > Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: init_ais_connection_once: > Node name mismatch! 
OpenAIS supplied (null), our lookup returned tstnode4 > Jan 15 13:19:29 tstnode4 crmd: [16615]: notice: > init_ais_connection_once: Node name mismatches usually occur when > assigned automatically by DHCP servers > Jan 15 13:19:29 tstnode4 crmd: [16615]: notice: > init_ais_connection_once: If this node was part of the cluster with a > different name, you will need to remove the old entry with crm_node > --remove > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: crm_abort: crm_strdup_fn: > Triggered assert at utils.c:782 : src != NULL > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_ha_control: Connected > to the cluster > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_started: Delaying > start, CCM (0000000000100000) not connected > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: send_ipc_message: IPC > Channel to 16665 is not connected > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: cib_native_perform_op: > Sending message to CIB service FAILED > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: config_query_callback: > Local CIB query resulted in an error: send failed > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: cib_native_msgready: Lost > connection to the CIB service [16665]. > Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: cib_native_dispatch: Lost > connection to the CIB service [16665/callback]. > Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: cib_native_dispatch: Lost > connection to the CIB service [16665/command]. > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: > crmd_cib_connection_destroy: Connection to the CIB terminated... > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input > I_ERROR from config_query_callback() received in state S_STARTING > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_state_transition: State > transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL > origin=config_query_callback ] > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_recover: Action > A_RECOVER (0000000001000000) not supported > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_started: Start > cancelled... 
S_RECOVERY > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input > I_ERROR from crmd_cib_connection_destroy() received in state S_RECOVERY > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_dc_release: DC role > released > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_te_control: > Transitioner is now inactive > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input > I_TERMINATE from do_recover() received in state S_RECOVERY > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_state_transition: State > transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE > cause=C_FSA_INTERNAL origin=do_recover ] > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_lrm_control: > Disconnected from the LRM > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_ha_control: > Disconnected from OpenAIS > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_cib_control: > Disconnecting CIB > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_exit: Performing > A_EXIT_0 - gracefully exiting the CRMd > Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_exit: Could not > recover from internal error > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: free_mem: Dropping > I_RELEASE_SUCCESS: [ state=S_TERMINATE cause=C_FSA_INTERNAL > origin=do_dc_release ] > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: free_mem: Dropping > I_TERMINATE: [ state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_stop ] > Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_exit: [crmd] stopped (2) > > > > At this time corosync has crashed on the three natty nodes and this is > how tstnode5 sees the cluster: > > root@tstnode5:~# crm status > ============ > Last updated: Tue Jan 15 13:11:01 2013 > Stack: openais > Current DC: tstnode5 - partition WITHOUT quorum > Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f > 4 Nodes configured, 4 expected votes > 0 Resources configured. > ============ > > Online: [ tstnode5 ] > OFFLINE: [ tstnode6 tstnode7 tstnode4 ] > > And the cluster configuration is: (we disabled stonith because it is not > installed on this test cluster.) > > root@tstnode5:~# crm configure show > node tstnode4 > node tstnode5 > node tstnode6 > node tstnode7 > property $id="cib-bootstrap-options" \ > dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \ > cluster-infrastructure="openais" \ > expected-quorum-votes="4" \ > stonith-enabled="false" > root@tstnode5:~# > > > root@tstnode5:~# crm_verify -LV > crm_verify[2534]: 2013/01/15_13:30:33 WARN: cluster_status: We do not > have quorum - fencing and resource management disabled > root@tstnode5:~# > > > Here is syslog on tstnode5 at the corresponding time: > Jan 15 13:17:01 tstnode5 CRON[2464]: (root) CMD ( cd / && run-parts > --report /etc/cron.hourly) > Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Corosync Cluster > Engine ('1.3.0'): started and ready to provide service. > Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Corosync built-in > features: nss > Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Successfully read > main configuration file '/etc/corosync/corosync.conf'. > Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] Initializing > transport (UDP/IP Multicast). > Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] Initializing > transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). > Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] The network > interface [10.0.185.105] is now up. 
> Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > process_ais_conf: Reading configure > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > config_find_init: Local handle: 2013064636357672963 for logging > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > config_find_next: Processing additional logging options... > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Found 'off' for option: debug > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Found 'yes' for option: to_logfile > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Found '/var/log/corosync/corosync.log' for option: logfile > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Found 'yes' for option: to_syslog > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Defaulting to 'daemon' for option: syslog_facility > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > config_find_init: Local handle: 4730966301143465988 for quorum > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > config_find_next: No additional configuration supplied for: quorum > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: No default for option: provider > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > config_find_init: Local handle: 7739444317642555397 for service > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > config_find_next: Processing additional service options... > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Found '0' for option: ver > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Defaulting to 'pcmk' for option: clustername > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: Invoked: > /usr/lib/heartbeat/stonithd > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_log_init_worker: > Changed active directory to /var/lib/heartbeat/cores/root > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: get_cluster_type: > Cluster type is: 'openais'. 
> Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_cluster_connect: > Connecting to cluster infrastructure: classic openais (with plugin) > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: > init_ais_connection_classic: Creating connection to our Corosync plugin > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Defaulting to 'no' for option: use_logd > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > get_config_opt: Defaulting to 'no' for option: use_mgmtd > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: > CRM: Initialized > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] Logging: Initialized > pcmk_startup > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: > Maximum core file size is: 18446744073709551615 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: > Service: 9 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup: > Local hostname: tstnode5 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_update_nodeid: Local node id: 1773731850 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Creating entry for node 1773731850 born on 0 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x7fd27c0023b0 Node 1773731850 now known as tstnode5 (was: (null)) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode5 now has 1 quorum votes (was 0) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node 1773731850/tstnode5 is now: member > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: > Forked child 2492 for process stonith-ng > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: > Forked child 2493 for process cib > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: > Forked child 2494 for process lrmd > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: > Forked child 2495 for process attrd > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: > Forked child 2496 for process pengine > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child: > Forked child 2497 for process crmd > Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine > loaded: Pacemaker Cluster Manager 1.1.5 > Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine > loaded: corosync extended virtual synchrony service > Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine > loaded: corosync configuration service > Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine > loaded: corosync cluster closed process group service v1.01 > Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine > loaded: corosync cluster config database access v1.01 > Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine > loaded: corosync profile loading service > Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine > loaded: corosync cluster quorum service v0.1 > Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Compatibility mode > set to whitetank. Using V1 and V2 of the synchronization engine. 
> Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: > init_ais_connection_classic: AIS connection established > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: > pcmk_peer_update: Transitional membership event on ring 204: memb=0, > new=0, lost=0 > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: get_ais_nodeid: > Server details: id=1773731850 uname=tstnode5 cname=pcmk > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: > pcmk_peer_update: Stable membership event on ring 204: memb=1, new=1, > lost=0 > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: > init_ais_connection_once: Connection to 'classic openais (with plugin)': > established > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: NEW: tstnode5 1773731850 > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_new_peer: Node > tstnode5 now has id: 1773731850 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: MEMB: tstnode5 1773731850 > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_new_peer: Node > 1773731850 is now known as tstnode5 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode5 now has process list: 00000000000000000000000000111312 > (1118994) > Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: main: Starting > stonith-ng mainloop > Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] A processor joined > or left the membership and a new membership was formed. > Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] chosen downlist from > node r(0) ip(10.0.185.105) > Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Completed service > synchronization, ready to provide service. > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: > Recorded connection 0x1f7c800 for stonith-ng/2492 > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: Invoked: > /usr/lib/heartbeat/attrd > Jan 15 13:19:26 tstnode5 pengine: [2496]: info: Invoked: > /usr/lib/heartbeat/pengine > Jan 15 13:19:26 tstnode5 crmd: [2497]: info: Invoked: > /usr/lib/heartbeat/crmd > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_log_init_worker: Changed > active directory to /var/lib/heartbeat/cores/hacluster > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_log_init_worker: > Changed active directory to /var/lib/heartbeat/cores/hacluster > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Starting up > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: get_cluster_type: Cluster > type is: 'openais'. > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_cluster_connect: > Connecting to cluster infrastructure: classic openais (with plugin) > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: > init_ais_connection_classic: Creating connection to our Corosync plugin > Jan 15 13:19:26 tstnode5 cib: [2493]: info: retrieveCib: Reading cluster > configuration from: /var/lib/heartbeat/crm/cib.xml (digest: > /var/lib/heartbeat/crm/cib.xml.sig) > Jan 15 13:19:26 tstnode5 crmd: [2497]: info: crm_log_init_worker: > Changed active directory to /var/lib/heartbeat/cores/hacluster > Jan 15 13:19:26 tstnode5 crmd: [2497]: info: main: CRM Hg Version: > 01e86afaaa6d4a8c4836f68df80ababd6ca3902f > Jan 15 13:19:26 tstnode5 crmd: [2497]: info: crmd_init: Starting crmd > Jan 15 13:19:26 tstnode5 cib: [2493]: info: validate_with_relaxng: > Creating RNG parser context > Jan 15 13:19:26 tstnode5 lrmd: [2494]: info: enabling coredumps > Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Core dumps could be lost if > multiple dumps occur. 
> Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Consider setting > non-default value in /proc/sys/kernel/core_pattern (or equivalent) for > maximum supportability > Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Consider setting > /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum > supportability > Jan 15 13:19:26 tstnode5 lrmd: [2494]: info: Started. > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: > init_ais_connection_classic: AIS connection established > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: > Recorded connection 0x1f80b60 for attrd/2495 > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: get_ais_nodeid: Server > details: id=1773731850 uname=tstnode5 cname=pcmk > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: init_ais_connection_once: > Connection to 'classic openais (with plugin)': established > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_new_peer: Node > tstnode5 now has id: 1773731850 > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_new_peer: Node > 1773731850 is now known as tstnode5 > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Cluster connection > active > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Accepting attribute > updates > Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Starting mainloop... > Jan 15 13:19:26 tstnode5 cib: [2493]: info: startCib: CIB Initialization > completed successfully > Jan 15 13:19:26 tstnode5 cib: [2493]: info: get_cluster_type: Cluster > type is: 'openais'. > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_cluster_connect: > Connecting to cluster infrastructure: classic openais (with plugin) > Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_classic: > Creating connection to our Corosync plugin > Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_classic: > AIS connection established > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: > Recorded connection 0x1f84ec0 for cib/2493 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: > Sending membership update 204 to cib > Jan 15 13:19:26 tstnode5 cib: [2493]: info: get_ais_nodeid: Server > details: id=1773731850 uname=tstnode5 cname=pcmk > Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_once: > Connection to 'classic openais (with plugin)': established > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node tstnode5 > now has id: 1773731850 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node > 1773731850 is now known as tstnode5 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: cib_init: Starting cib mainloop > Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: > Membership 204: quorum still lost > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node > tstnode5: id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) > (new) votes=1 (new) born=0 seen=204 > proc=00000000000000000000000000111312 (new) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: > pcmk_peer_update: Transitional membership event on ring 208: memb=1, > new=0, lost=0 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: memb: tstnode5 1773731850 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice: > pcmk_peer_update: Stable membership event on ring 208: memb=4, new=3, > lost=0 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Creating entry for node 1756954634 born on 208 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node 1756954634/unknown is now: member > Jan 15 
13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: NEW: .pending. 1756954634 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Creating entry for node 1790509066 born on 208 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node 1790509066/unknown is now: member > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: NEW: .pending. 1790509066 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Creating entry for node 1807286282 born on 208 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node 1807286282/unknown is now: member > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: NEW: .pending. 1807286282 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: MEMB: .pending. 1756954634 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: > Membership 208: quorum still lost > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: MEMB: tstnode5 1773731850 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null> > now has id: 1756954634 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: MEMB: .pending. 1790509066 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: MEMB: .pending. 1807286282 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node > (null): id=1756954634 state=member (new) addr=r(0) ip(10.0.185.104) > votes=0 born=0 seen=208 proc=00000000000000000000000000000000 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > send_member_notification: Sending membership update 208 to 1 children > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null> > now has id: 1790509066 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x7fd27c0023b0 Node 1773731850 ((null)) born on: 208 > Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] A processor joined > or left the membership and a new membership was formed. 
> Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node > (null): id=1790509066 state=member (new) addr=r(0) ip(10.0.185.106) > votes=0 born=0 seen=208 proc=00000000000000000000000000000000 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null> > now has id: 1807286282 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node > (null): id=1807286282 state=member (new) addr=r(0) ip(10.0.185.107) > votes=0 born=0 seen=208 proc=00000000000000000000000000000000 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x1f705c0 Node 1790509066 (tstnode6) born on: 196 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x1f705c0 Node 1790509066 now known as tstnode6 (was: (null)) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode6 now has process list: 00000000000000000000000000013312 > (78610) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode6 now has 1 quorum votes (was 0) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > send_member_notification: Sending membership update 208 to 1 children > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: > route_ais_message: Sending message to local.crmd failed: ipc delivery > failed (rc=-2) > Jan 15 13:19:26 tstnode5 cib: [2493]: notice: ais_dispatch_message: > Membership 208: quorum acquired > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node > 1790509066 is now known as tstnode6 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x1f707f0 Node 1807286282 (tstnode7) born on: 204 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node > tstnode6: id=1790509066 state=member addr=r(0) ip(10.0.185.106) votes=1 > (new) born=196 seen=208 proc=00000000000000000000000000013312 (new) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x1f707f0 Node 1807286282 now known as tstnode7 (was: (null)) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode7 now has process list: 00000000000000000000000000013312 > (78610) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode7 now has 1 quorum votes (was 0) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > update_expected_votes: Expected quorum votes 2 -> 3 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > send_member_notification: Sending membership update 208 to 1 children > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: > route_ais_message: Sending message to local.crmd failed: ipc delivery > failed (rc=-2) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x1f703b0 Node 1756954634 (tstnode4) born on: 196 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > 0x1f703b0 Node 1756954634 now known as tstnode4 (was: (null)) > Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: > Membership 208: quorum retained > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode4 now has process list: 00000000000000000000000000013312 > (78610) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node tstnode4 now has 1 quorum votes (was 0) > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node > 1807286282 is now known as tstnode7 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > update_expected_votes: Expected quorum votes 3 -> 4 > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: > 
send_member_notification: Sending membership update 208 to 1 children > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node > tstnode7: id=1807286282 state=member addr=r(0) ip(10.0.185.107) votes=1 > (new) born=204 seen=208 proc=00000000000000000000000000013312 (new) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: > route_ais_message: Sending message to local.crmd failed: ipc delivery > failed (rc=-2) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: > route_ais_message: Sending message to local.crmd failed: ipc delivery > failed (rc=-2) > Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message: > Membership 208: quorum retained > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node > 1756954634 is now known as tstnode4 > Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node > tstnode4: id=1756954634 state=member addr=r(0) ip(10.0.185.104) votes=1 > (new) born=196 seen=208 proc=00000000000000000000000000013312 (new) > Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN: > route_ais_message: Sending message to local.crmd failed: ipc delivery > failed (rc=-2) > Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from > node 1790509066 > Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from > node 1807286282 > Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from > node 1756954634 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_cib_control: CIB > connection established > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: get_cluster_type: Cluster > type is: 'openais'. > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_cluster_connect: > Connecting to cluster infrastructure: classic openais (with plugin) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: > init_ais_connection_classic: Creating connection to our Corosync plugin > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: > init_ais_connection_classic: AIS connection established > Jan 15 13:19:27 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: > Recorded connection 0x7fd27c002800 for crmd/2497 > Jan 15 13:19:27 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc: > Sending membership update 208 to crmd > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: get_ais_nodeid: Server > details: id=1773731850 uname=tstnode5 cname=pcmk > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: init_ais_connection_once: > Connection to 'classic openais (with plugin)': established > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode5 > now has id: 1773731850 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node > 1773731850 is now known as tstnode5 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode5 is now unknown > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_ha_control: Connected to > the cluster > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: Delaying start, > no membership data (0000000000100000) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crmd_init: Starting crmd's > mainloop > Jan 15 13:19:27 tstnode5 crmd: [2497]: notice: ais_dispatch_message: > Membership 208: quorum acquired > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode4 > now has id: 1756954634 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node > 1756954634 is now known as tstnode4 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode4 is now unknown > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode4 is now member (was unknown) > Jan 
15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node > tstnode4: id=1756954634 state=member (new) addr=r(0) ip(10.0.185.104) > votes=1 born=196 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:27 tstnode5 crmd: [2497]: notice: crmd_peer_update: Status > update: Client tstnode5/crmd now has status [online] (DC=<null>) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode5 is now member (was unknown) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node > tstnode5: id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105) > (new) votes=1 (new) born=208 seen=208 > proc=00000000000000000000000000111312 (new) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode6 > now has id: 1790509066 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node > 1790509066 is now known as tstnode6 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode6 is now unknown > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode6 is now member (was unknown) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node > tstnode6: id=1790509066 state=member (new) addr=r(0) ip(10.0.185.106) > votes=1 born=196 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode7 > now has id: 1807286282 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node > 1807286282 is now known as tstnode7 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode7 is now unknown > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode7 is now member (was unknown) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node > tstnode7: id=1807286282 state=member (new) addr=r(0) ip(10.0.185.107) > votes=1 born=204 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: Delaying start, > Config not read (0000000000000040) > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback: > Shutdown escalation occurs after: 1200000ms > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback: > Checking for expired actions every 900000ms > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback: > Sending expected-votes=4 to corosync > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: The local CRM > is operational > Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_state_transition: State > transition S_STARTING -> S_PENDING [ input=I_PENDING > cause=C_FSA_INTERNAL origin=do_started ] > Jan 15 13:19:27 tstnode5 corosync[2484]: [TOTEM ] A processor failed, > forming new configuration. > Jan 15 13:19:28 tstnode5 crmd: [2497]: info: ais_dispatch_message: > Membership 208: quorum retained > Jan 15 13:19:28 tstnode5 crmd: [2497]: info: te_connect_stonith: > Attempting connection to fencing daemon... 
> Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] notice: > pcmk_peer_update: Transitional membership event on ring 212: memb=1, > new=0, lost=3 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: memb: tstnode5 1773731850 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: lost: tstnode4 1756954634 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: lost: tstnode6 1790509066 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: lost: tstnode7 1807286282 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] notice: > pcmk_peer_update: Stable membership event on ring 212: memb=1, new=0, > lost=0 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > pcmk_peer_update: MEMB: tstnode5 1773731850 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > ais_mark_unseen_peer_dead: Node tstnode4 was not seen in the previous > transition > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node 1756954634/tstnode4 is now: lost > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > ais_mark_unseen_peer_dead: Node tstnode6 was not seen in the previous > transition > Jan 15 13:19:28 tstnode5 cib: [2493]: notice: ais_dispatch_message: > Membership 212: quorum lost > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node 1790509066/tstnode6 is now: lost > Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node > tstnode4: id=1756954634 state=lost (new) addr=r(0) ip(10.0.185.104) > votes=1 born=196 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > ais_mark_unseen_peer_dead: Node tstnode7 was not seen in the previous > transition > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member: > Node 1807286282/tstnode7 is now: lost > Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: > send_member_notification: Sending membership update 212 to 2 children > Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node > tstnode6: id=1790509066 state=lost (new) addr=r(0) ip(10.0.185.106) > votes=1 born=196 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:28 tstnode5 corosync[2484]: [TOTEM ] A processor joined > or left the membership and a new membership was formed. > Jan 15 13:19:28 tstnode5 corosync[2484]: [CPG ] chosen downlist from > node r(0) ip(10.0.185.105) > Jan 15 13:19:28 tstnode5 corosync[2484]: [MAIN ] Completed service > synchronization, ready to provide service. 
> Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node > tstnode7: id=1807286282 state=lost (new) addr=r(0) ip(10.0.185.107) > votes=1 born=204 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:29 tstnode5 crmd: [2497]: info: te_connect_stonith: Connected > Jan 15 13:19:29 tstnode5 crmd: [2497]: notice: ais_dispatch_message: > Membership 212: quorum lost > Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode4 is now lost (was member) > Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node > tstnode4: id=1756954634 state=lost (new) addr=r(0) ip(10.0.185.104) > votes=1 born=196 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode6 is now lost (was member) > Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node > tstnode6: id=1790509066 state=lost (new) addr=r(0) ip(10.0.185.106) > votes=1 born=196 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback: > status: tstnode7 is now lost (was member) > Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node > tstnode7: id=1807286282 state=lost (new) addr=r(0) ip(10.0.185.107) > votes=1 born=204 seen=208 proc=00000000000000000000000000013312 > Jan 15 13:19:31 tstnode5 attrd: [2495]: info: cib_connect: Connected to > the CIB after 1 signon attempts > Jan 15 13:19:31 tstnode5 attrd: [2495]: info: cib_connect: Sending full > refresh > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crm_timer_popped: Election > Trigger (I_DC_TIMEOUT) just popped! (60000ms) > Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: do_log: FSA: Input > I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State > transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT > cause=C_TIMER_POPPED origin=crm_timer_popped ] > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State > transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC > cause=C_FSA_INTERNAL origin=do_election_check ] > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_te_control: Registering > TE UUID: fef69ffc-bef2-475d-b3c4-ddaa94ab8af6 > Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: > cib_client_add_notify_callback: Callback already present > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: set_graph_functions: > Setting custom graph functions > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: unpack_graph: Unpacked > transition -1: 0 actions in 0 synapses > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_takeover: Taking over > DC status for this partition > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_readwrite: We > are now in R/W mode > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_master for section 'all' > (origin=local/crmd/5, version=0.12.1): ok (rc=0) > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section cib (origin=local/crmd/6, > version=0.12.2): ok (rc=0) > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section crm_config > (origin=local/crmd/8, version=0.12.3): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: join_make_offer: Making > join offers based on membership 212 > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_offer_all: > join-1: Waiting on 1 outstanding join acks > Jan 15 13:20:28 
tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section crm_config > (origin=local/crmd/10, version=0.12.4): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback: > Shutdown escalation occurs after: 1200000ms > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback: > Checking for expired actions every 900000ms > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback: > Sending expected-votes=4 to corosync > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: ais_dispatch_message: > Membership 212: quorum still lost > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crmd_ais_dispatch: Setting > expected votes to 4 > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: update_dc: Set DC to > tstnode5 (3.0.5) > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section crm_config > (origin=local/crmd/13, version=0.12.5): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: ais_dispatch_message: > Membership 212: quorum still lost > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crmd_ais_dispatch: Setting > expected votes to 4 > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section crm_config > (origin=local/crmd/16, version=0.12.6): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State > transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED > cause=C_FSA_INTERNAL origin=check_join_state ] > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: All 1 > cluster nodes responded to the join offer. > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_finalize: > join-1: Syncing the CIB from tstnode5 to the rest of the cluster > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_sync for section 'all' (origin=local/crmd/17, > version=0.12.6): ok (rc=0) > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section nodes > (origin=local/crmd/18, version=0.12.7): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: update_attrd: Connecting to > attrd... 
> Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating > hash entry for terminate > Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating > hash entry for shutdown > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_delete for section > //node_state[@uname='tstnode5']/transient_attributes > (origin=local/crmd/19, version=0.12.8): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: erase_xpath_callback: > Deletion of "//node_state[@uname='tstnode5']/transient_attributes": ok > (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_ack: join-1: > Updating node state to member for tstnode5 > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_delete for section > //node_state[@uname='tstnode5']/lrm (origin=local/crmd/20, > version=0.12.9): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: erase_xpath_callback: > Deletion of "//node_state[@uname='tstnode5']/lrm": ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State > transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED > cause=C_FSA_INTERNAL origin=check_join_state ] > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: All 1 > cluster nodes are eligible to run resources. > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_final: Ensuring > DC, quorum and node attributes are up-to-date > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crm_update_quorum: Updating > quorum status to false (call=24) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: > do_te_invoke:173 - Triggered transition abort (complete=1) : Peer Cancelled > Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_local_callback: > Sending full refresh (origin=crmd) > Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update: > Sending flush op to all hosts for: terminate (<null>) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 25: > Requesting the current CIB: S_POLICY_ENGINE > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section nodes > (origin=local/crmd/22, version=0.12.11): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match > for shutdown action on tstnode4 > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff: > Stonith/shutdown of tstnode4 not matched > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: > te_update_diff:200 - Triggered transition abort (complete=1, > tag=node_state, id=tstnode4, magic=NA, cib=0.12.12) : Node failure > Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match > for shutdown action on tstnode6 > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff: > Stonith/shutdown of tstnode6 not matched > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: > te_update_diff:200 - Triggered transition abort (complete=1, > tag=node_state, id=tstnode6, magic=NA, cib=0.12.12) : Node failure > Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match > for shutdown action on tstnode7 > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff: > Stonith/shutdown of tstnode7 not matched > Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request: > Operation complete: op cib_modify for section cib (origin=local/crmd/24, > version=0.12.13): ok (rc=0) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph: > 
te_update_diff:200 - Triggered transition abort (complete=1, > tag=node_state, id=tstnode7, magic=NA, cib=0.12.12) : Node failure > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 26: > Requesting the current CIB: S_POLICY_ENGINE > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 27: > Requesting the current CIB: S_POLICY_ENGINE > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 28: > Requesting the current CIB: S_POLICY_ENGINE > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke_callback: > Invoking the PE: query=28, ref=pe_calc-dc-1358256028-8, seq=212, quorate=0 > Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update: > Sending flush op to all hosts for: shutdown (<null>) > Jan 15 13:20:28 tstnode5 pengine: [2496]: WARN: cluster_status: We do > not have quorum - fencing and resource management disabled > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State > transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS > cause=C_IPC_MESSAGE origin=handle_response ] > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: unpack_graph: Unpacked > transition 0: 1 actions in 1 synapses > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_te_invoke: Processing > graph 0 (ref=pe_calc-dc-1358256028-8) derived from > /var/lib/pengine/pe-input-30.bz2 > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_rsc_command: Initiating > action 2: probe_complete probe_complete on tstnode5 (local) - no waiting > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: run_graph: > ==================================================== > Jan 15 13:20:28 tstnode5 crmd: [2497]: notice: run_graph: Transition 0 > (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, > Source=/var/lib/pengine/pe-input-30.bz2): Complete > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_graph_trigger: > Transition 0 is now complete > Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating > hash entry for probe_complete > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: notify_crmd: Transition 0 > status: done - <null> > Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update: > Sending flush op to all hosts for: probe_complete (true) > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State > transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS > cause=C_FSA_INTERNAL origin=notify_crmd ] > Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: > Starting PEngine Recheck Timer > Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_perform_update: Sent > update 8: probe_complete=true
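P.S. As mentioned at the top, here is a rough, untested sketch of what replacing the packaged corosync with a self-built 1.4.5 on a single node could look like. The tarball location, build dependencies and configure options below are my assumptions rather than a verified procedure, and whether a 1.4.5 node will form a stable membership with the remaining 1.2.1/1.3.0 nodes is exactly the part I cannot promise, so please try it on the test cluster first.

# Stop the cluster stack on the node being upgraded. With the pacemaker
# plugin loaded by corosync (service ver: 0) the pacemaker daemons are
# children of corosync, so stopping corosync stops them as well.
/etc/init.d/corosync stop

# Build and install corosync 1.4.5 from a release tarball (assumed to be
# already downloaded to this node); gcc, make, pkg-config and the NSS
# development headers are assumed to be installed.
tar xzf corosync-1.4.5.tar.gz
cd corosync-1.4.5
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
make
make install    # note: this overwrites files owned by the distro package

# Keep the existing /etc/corosync/corosync.conf, restart and check status.
/etc/init.d/corosync start
crm status

If a mixed-version ring still falls apart the way your logs show, the safer (though more disruptive) route is probably to stop corosync on every node and upgrade the whole cluster before starting it again.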