Hi,
We are planning to upgrade a pacemaker cluster on Ubuntu natty (11.04) to
oneiric (11.10), which involves moving from corosync version 1.2.1 to 1.3.0
and pacemaker from version 1.0.10 to 1.1.5. We decided to test the process
by creating a simple cluster running natty and then introducing a new node
running oneiric. When we did this, the corosync service on all the natty
nodes crashed.
Is this something that we should be able to do? I.e. should nodes running
version 1.2.1 of corosync interoperate with nodes running version 1.3.0 in
the same corosync cluster?
Any suggestions for how to do a rolling upgrade of a pacemaker/corosync
cluster?
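For reference, the per-node sequence we were assuming for the rolling
upgrade is roughly the following (these commands are our assumption of the
usual procedure, not something we have verified; pacemaker runs as a
corosync plugin in our setup, so stopping corosync also stops pacemaker):
root@tstnodeN:~# crm node standby tstnodeN   # move resources off the node
root@tstnodeN:~# service corosync stop       # also stops the pacemaker plugin
... upgrade the node from natty to oneiric (e.g. do-release-upgrade) ...
root@tstnodeN:~# service corosync start
root@tstnodeN:~# crm node online tstnodeN    # let it run resources again
(tstnodeN here stands for whichever node is being upgraded.)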
If this is not the correct place for this question, is there a better forum?
Thanks
Ollie
Syslogs from the nodes are below.
Before starting corosync on the oneiric node we have three natty nodes
(tstnode4, 6 and 7) in a cluster with no resources configured. Corosync
version 1.3.0 is installed on tstnode5, but the service has not been
started yet:
root@tstnode4:~# crm status
============
Last updated: Tue Jan 15 11:35:12 2013
Stack: openais
Current DC: tstnode4 - partition with quorum
Version: 1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3
3 Nodes configured, 3 expected votes
0 Resources configured.
============
Online: [ tstnode4 tstnode6 tstnode7 ]
root@tstnode4:~#
We started corosync on tstnode5 at about 13:19. Here is the syslog on
tstnode4, showing the corosync service stopping:
Jan 15 13:17:01 tstnode4 CRON[16754]: (root) CMD ( cd / && run-parts
--report /etc/cron.hourly)
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] notice:
pcmk_peer_update: Transitional membership event on ring 208: memb=3,
new=0, lost=0
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: memb: tstnode4 1756954634
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: memb: tstnode6 1790509066
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: memb: tstnode7 1807286282
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] notice:
pcmk_peer_update: Stable membership event on ring 208: memb=4, new=1, lost=0
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
update_member: Creating entry for node 1773731850 born on 208
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
update_member: Node 1773731850/unknown is now: member
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: NEW: .pending. 1773731850
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: MEMB: tstnode4 1756954634
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: MEMB: .pending. 1773731850
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: MEMB: tstnode6 1790509066
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
pcmk_peer_update: MEMB: tstnode7 1807286282
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
send_member_notification: Sending membership update 208 to 2 children
Jan 15 13:19:26 tstnode4 corosync[16656]: [TOTEM ] A processor joined
or left the membership and a new membership was formed.
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_dispatch: Membership
208: quorum retained
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_new_peer: Node <null>
now has id: 1773731850
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_update_peer: Node
(null): id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105)
votes=0 born=0 seen=208 proc=00000000000000000000000000000000
Jan 15 13:19:26 tstnode4 cib: [16665]: info: ais_dispatch: Membership
208: quorum retained
Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_new_peer: Node <null>
now has id: 1773731850
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
update_member: 0x20388e0 Node 1773731850 (tstnode5) born on: 208
Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_update_peer: Node
(null): id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105)
votes=0 born=0 seen=208 proc=00000000000000000000000000000000
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
update_member: 0x20388e0 Node 1773731850 now known as tstnode5 (was: (null))
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
update_member: Node tstnode5 now has process list:
00000000000000000000000000111312 (1118994)
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
update_member: Node tstnode5 now has 1 quorum votes (was 0)
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
update_expected_votes: Expected quorum votes 3 -> 4
Jan 15 13:19:26 tstnode4 corosync[16656]: [pcmk ] info:
send_member_notification: Sending membership update 208 to 2 children
Jan 15 13:19:26 tstnode4 cib: [16665]: info: ais_dispatch: Membership
208: quorum retained
Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_get_peer: Node
1773731850 is now known as tstnode5
Jan 15 13:19:26 tstnode4 cib: [16665]: info: crm_update_peer: Node
tstnode5: id=1773731850 state=member addr=r(0) ip(10.0.185.105) votes=1
(new) born=208 seen=208 proc=00000000000000000000000000111312 (new)
Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request:
Operation complete: op cib_modify for section nodes
(origin=local/crmd/37, version=0.15.4): ok (rc=0)
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_ais_dispatch: Setting
expected votes to 3
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_dispatch: Membership
208: quorum retained
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_get_peer: Node
1773731850 is now known as tstnode5
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: ais_status_callback:
status: tstnode5 is now member
Jan 15 13:19:26 tstnode4 crmd: [16669]: info: crm_update_peer: Node
tstnode5: id=1773731850 state=member addr=r(0) ip(10.0.185.105) votes=1
(new) born=208 seen=208 proc=00000000000000000000000000111312 (new)
Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request:
Operation complete: op cib_modify for section crm_config
(origin=local/crmd/40, version=0.15.4): ok (rc=0)
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
- <cib admin_epoch="0" epoch="15" num_updates="4" />
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
+ <cib admin_epoch="0" epoch="16" num_updates="1" >
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
+ <configuration >
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
+ <nodes >
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
+ <node id="tstnode5" uname="tstnode5" type="normal"
__crm_diff_marker__="added:top" />
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
+ </nodes>
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
+ </configuration>
Jan 15 13:19:26 tstnode4 cib: [16665]: info: log_data_element: cib:diff:
+ </cib>
Jan 15 13:19:26 tstnode4 cib: [16665]: info: cib_process_request:
Operation complete: op cib_modify for section nodes
(origin=local/crmd/41, version=0.16.1): ok (rc=0)
Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: ais_dispatch:
Receiving message body failed: (2) Library error: Resource temporarily
unavailable (11)
Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: ais_dispatch: AIS
connection failed
Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: Receiving
message body failed: (2) Library error: Resource temporarily unavailable
(11)
Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: Receiving
message body failed: (2) Library error: Resource temporarily unavailable
(11)
Jan 15 13:19:27 tstnode4 stonithd: [16664]: ERROR: AIS connection terminated
Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR: ais_dispatch: AIS
connection failed
Jan 15 13:19:27 tstnode4 attrd: [16667]: CRIT: attrd_ais_destroy: Lost
connection to OpenAIS service!
Jan 15 13:19:27 tstnode4 attrd: [16667]: info: main: Exiting...
Jan 15 13:19:27 tstnode4 attrd: [16667]: ERROR:
attrd_cib_connection_destroy: Connection to the CIB terminated...
Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: send_ais_text: Sending
message 24: FAILED (rc=2): Library error: Connection timed out (110)
Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: send_ais_text: Sending
message 20: FAILED (rc=2): Library error: Connection timed out (110)
Jan 15 13:19:28 tstnode4 crmd: [16615]: ERROR: send_ais_text: Sending
message 0: FAILED (rc=2): Library error: Connection timed out (110)
Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: send_ais_message: Not
connected to AIS
Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: ais_dispatch: Receiving
message body failed: (2) Library error: Resource temporarily unavailable
(11)
Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: ais_dispatch: AIS
connection failed
Jan 15 13:19:28 tstnode4 cib: [16665]: ERROR: cib_ais_destroy: AIS
connection terminated
Jan 15 13:19:28 tstnode4 crmd: [16669]: info: crm_ais_dispatch: Setting
expected votes to 4
Jan 15 13:19:28 tstnode4 crmd: [16669]: info: abort_transition_graph:
need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Jan 15 13:19:28 tstnode4 crmd: [16669]: info: need_abort: Aborting on
change to admin_epoch
Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: ais_dispatch: Receiving
message body failed: (2) Library error: Resource temporarily unavailable
(11)
Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: ais_dispatch: AIS
connection failed
Jan 15 13:19:28 tstnode4 crmd: [16669]: ERROR: crm_ais_destroy: AIS
connection terminated
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: get_ais_nodeid: Sending
nodeid request: FAILED (rc=2): Library error
Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: init_ais_connection_once:
Node name mismatch! OpenAIS supplied (null), our lookup returned tstnode4
Jan 15 13:19:29 tstnode4 crmd: [16615]: notice:
init_ais_connection_once: Node name mismatches usually occur when
assigned automatically by DHCP servers
Jan 15 13:19:29 tstnode4 crmd: [16615]: notice:
init_ais_connection_once: If this node was part of the cluster with a
different name, you will need to remove the old entry with crm_node --remove
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: crm_abort: crm_strdup_fn:
Triggered assert at utils.c:782 : src != NULL
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_ha_control: Connected
to the cluster
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_started: Delaying
start, CCM (0000000000100000) not connected
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: send_ipc_message: IPC
Channel to 16665 is not connected
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: cib_native_perform_op:
Sending message to CIB service FAILED
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: config_query_callback:
Local CIB query resulted in an error: send failed
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: cib_native_msgready: Lost
connection to the CIB service [16665].
Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: cib_native_dispatch: Lost
connection to the CIB service [16665/callback].
Jan 15 13:19:29 tstnode4 crmd: [16615]: CRIT: cib_native_dispatch: Lost
connection to the CIB service [16665/command].
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR:
crmd_cib_connection_destroy: Connection to the CIB terminated...
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input
I_ERROR from config_query_callback() received in state S_STARTING
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_state_transition: State
transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL
origin=config_query_callback ]
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_recover: Action
A_RECOVER (0000000001000000) not supported
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_started: Start
cancelled... S_RECOVERY
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input
I_ERROR from crmd_cib_connection_destroy() received in state S_RECOVERY
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_dc_release: DC role
released
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_te_control:
Transitioner is now inactive
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_log: FSA: Input
I_TERMINATE from do_recover() received in state S_RECOVERY
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_state_transition: State
transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE
cause=C_FSA_INTERNAL origin=do_recover ]
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_lrm_control:
Disconnected from the LRM
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_ha_control:
Disconnected from OpenAIS
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_cib_control:
Disconnecting CIB
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_exit: Performing
A_EXIT_0 - gracefully exiting the CRMd
Jan 15 13:19:29 tstnode4 crmd: [16615]: ERROR: do_exit: Could not
recover from internal error
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: free_mem: Dropping
I_RELEASE_SUCCESS: [ state=S_TERMINATE cause=C_FSA_INTERNAL
origin=do_dc_release ]
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: free_mem: Dropping
I_TERMINATE: [ state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_stop ]
Jan 15 13:19:29 tstnode4 crmd: [16615]: info: do_exit: [crmd] stopped (2)
At this time corosync has crashed on all three natty nodes, and this is
how tstnode5 sees the cluster:
root@tstnode5:~# crm status
============
Last updated: Tue Jan 15 13:11:01 2013
Stack: openais
Current DC: tstnode5 - partition WITHOUT quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
4 Nodes configured, 4 expected votes
0 Resources configured.
============
Online: [ tstnode5 ]
OFFLINE: [ tstnode6 tstnode7 tstnode4 ]
And the cluster configuration is as follows (we disabled stonith because
it is not installed on this test cluster):
root@tstnode5:~# crm configure show
node tstnode4
node tstnode5
node tstnode6
node tstnode7
property $id="cib-bootstrap-options" \
dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="4" \
stonith-enabled="false"
root@tstnode5:~#
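For completeness, stonith was disabled with what we believe is the
standard crm shell property command:
root@tstnode5:~# crm configure property stonith-enabled=false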
root@tstnode5:~# crm_verify -LV
crm_verify[2534]: 2013/01/15_13:30:33 WARN: cluster_status: We do not
have quorum - fencing and resource management disabled
root@tstnode5:~#
Here is the syslog on tstnode5 at the corresponding time:
Jan 15 13:17:01 tstnode5 CRON[2464]: (root) CMD ( cd / && run-parts
--report /etc/cron.hourly)
Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Corosync Cluster
Engine ('1.3.0'): started and ready to provide service.
Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Corosync built-in
features: nss
Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Successfully read
main configuration file '/etc/corosync/corosync.conf'.
Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] Initializing
transport (UDP/IP Multicast).
Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] Initializing
transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] The network
interface [10.0.185.105] is now up.
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
process_ais_conf: Reading configure
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
config_find_init: Local handle: 2013064636357672963 for logging
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
config_find_next: Processing additional logging options...
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Found 'off' for option: debug
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Found 'yes' for option: to_logfile
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Found '/var/log/corosync/corosync.log' for option: logfile
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Found 'yes' for option: to_syslog
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Defaulting to 'daemon' for option: syslog_facility
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
config_find_init: Local handle: 4730966301143465988 for quorum
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
config_find_next: No additional configuration supplied for: quorum
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: No default for option: provider
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
config_find_init: Local handle: 7739444317642555397 for service
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
config_find_next: Processing additional service options...
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Found '0' for option: ver
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Defaulting to 'pcmk' for option: clustername
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: Invoked:
/usr/lib/heartbeat/stonithd
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_log_init_worker:
Changed active directory to /var/lib/heartbeat/cores/root
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: get_cluster_type:
Cluster type is: 'openais'.
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info:
init_ais_connection_classic: Creating connection to our Corosync plugin
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Defaulting to 'no' for option: use_logd
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
get_config_opt: Defaulting to 'no' for option: use_mgmtd
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup:
CRM: Initialized
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] Logging: Initialized
pcmk_startup
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup:
Maximum core file size is: 18446744073709551615
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup:
Service: 9
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_startup:
Local hostname: tstnode5
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_update_nodeid: Local node id: 1773731850
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Creating entry for node 1773731850 born on 0
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x7fd27c0023b0 Node 1773731850 now known as tstnode5 (was: (null))
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode5 now has 1 quorum votes (was 0)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node 1773731850/tstnode5 is now: member
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child:
Forked child 2492 for process stonith-ng
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child:
Forked child 2493 for process cib
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child:
Forked child 2494 for process lrmd
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child:
Forked child 2495 for process attrd
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child:
Forked child 2496 for process pengine
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: spawn_child:
Forked child 2497 for process crmd
Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine
loaded: Pacemaker Cluster Manager 1.1.5
Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine
loaded: corosync extended virtual synchrony service
Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine
loaded: corosync configuration service
Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine
loaded: corosync cluster closed process group service v1.01
Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine
loaded: corosync cluster config database access v1.01
Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine
loaded: corosync profile loading service
Jan 15 13:19:26 tstnode5 corosync[2484]: [SERV ] Service engine
loaded: corosync cluster quorum service v0.1
Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Compatibility mode
set to whitetank. Using V1 and V2 of the synchronization engine.
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info:
init_ais_connection_classic: AIS connection established
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice:
pcmk_peer_update: Transitional membership event on ring 204: memb=0,
new=0, lost=0
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: get_ais_nodeid:
Server details: id=1773731850 uname=tstnode5 cname=pcmk
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice:
pcmk_peer_update: Stable membership event on ring 204: memb=1, new=1, lost=0
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info:
init_ais_connection_once: Connection to 'classic openais (with plugin)':
established
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: NEW: tstnode5 1773731850
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_new_peer: Node
tstnode5 now has id: 1773731850
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: MEMB: tstnode5 1773731850
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: crm_new_peer: Node
1773731850 is now known as tstnode5
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode5 now has process list: 00000000000000000000000000111312
(1118994)
Jan 15 13:19:26 tstnode5 stonith-ng: [2492]: info: main: Starting
stonith-ng mainloop
Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] A processor joined
or left the membership and a new membership was formed.
Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] chosen downlist from
node r(0) ip(10.0.185.105)
Jan 15 13:19:26 tstnode5 corosync[2484]: [MAIN ] Completed service
synchronization, ready to provide service.
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc:
Recorded connection 0x1f7c800 for stonith-ng/2492
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: Invoked:
/usr/lib/heartbeat/attrd
Jan 15 13:19:26 tstnode5 pengine: [2496]: info: Invoked:
/usr/lib/heartbeat/pengine
Jan 15 13:19:26 tstnode5 crmd: [2497]: info: Invoked:
/usr/lib/heartbeat/crmd
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_log_init_worker: Changed
active directory to /var/lib/heartbeat/cores/hacluster
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_log_init_worker:
Changed active directory to /var/lib/heartbeat/cores/hacluster
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Starting up
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: get_cluster_type: Cluster
type is: 'openais'.
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Jan 15 13:19:26 tstnode5 attrd: [2495]: info:
init_ais_connection_classic: Creating connection to our Corosync plugin
Jan 15 13:19:26 tstnode5 cib: [2493]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
/var/lib/heartbeat/crm/cib.xml.sig)
Jan 15 13:19:26 tstnode5 crmd: [2497]: info: crm_log_init_worker:
Changed active directory to /var/lib/heartbeat/cores/hacluster
Jan 15 13:19:26 tstnode5 crmd: [2497]: info: main: CRM Hg Version:
01e86afaaa6d4a8c4836f68df80ababd6ca3902f
Jan 15 13:19:26 tstnode5 crmd: [2497]: info: crmd_init: Starting crmd
Jan 15 13:19:26 tstnode5 cib: [2493]: info: validate_with_relaxng:
Creating RNG parser context
Jan 15 13:19:26 tstnode5 lrmd: [2494]: info: enabling coredumps
Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Core dumps could be lost if
multiple dumps occur.
Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Consider setting
non-default value in /proc/sys/kernel/core_pattern (or equivalent) for
maximum supportability
Jan 15 13:19:26 tstnode5 lrmd: [2494]: WARN: Consider setting
/proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
supportability
Jan 15 13:19:26 tstnode5 lrmd: [2494]: info: Started.
Jan 15 13:19:26 tstnode5 attrd: [2495]: info:
init_ais_connection_classic: AIS connection established
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc:
Recorded connection 0x1f80b60 for attrd/2495
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: get_ais_nodeid: Server
details: id=1773731850 uname=tstnode5 cname=pcmk
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: init_ais_connection_once:
Connection to 'classic openais (with plugin)': established
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_new_peer: Node
tstnode5 now has id: 1773731850
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: crm_new_peer: Node
1773731850 is now known as tstnode5
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Cluster connection
active
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Accepting attribute
updates
Jan 15 13:19:26 tstnode5 attrd: [2495]: info: main: Starting mainloop...
Jan 15 13:19:26 tstnode5 cib: [2493]: info: startCib: CIB Initialization
completed successfully
Jan 15 13:19:26 tstnode5 cib: [2493]: info: get_cluster_type: Cluster
type is: 'openais'.
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_classic:
Creating connection to our Corosync plugin
Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_classic:
AIS connection established
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc:
Recorded connection 0x1f84ec0 for cib/2493
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc:
Sending membership update 204 to cib
Jan 15 13:19:26 tstnode5 cib: [2493]: info: get_ais_nodeid: Server
details: id=1773731850 uname=tstnode5 cname=pcmk
Jan 15 13:19:26 tstnode5 cib: [2493]: info: init_ais_connection_once:
Connection to 'classic openais (with plugin)': established
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node tstnode5
now has id: 1773731850
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node
1773731850 is now known as tstnode5
Jan 15 13:19:26 tstnode5 cib: [2493]: info: cib_init: Starting cib mainloop
Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message:
Membership 204: quorum still lost
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node
tstnode5: id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105)
(new) votes=1 (new) born=0 seen=204
proc=00000000000000000000000000111312 (new)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice:
pcmk_peer_update: Transitional membership event on ring 208: memb=1,
new=0, lost=0
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: memb: tstnode5 1773731850
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] notice:
pcmk_peer_update: Stable membership event on ring 208: memb=4, new=3, lost=0
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Creating entry for node 1756954634 born on 208
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node 1756954634/unknown is now: member
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: NEW: .pending. 1756954634
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Creating entry for node 1790509066 born on 208
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node 1790509066/unknown is now: member
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: NEW: .pending. 1790509066
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Creating entry for node 1807286282 born on 208
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node 1807286282/unknown is now: member
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: NEW: .pending. 1807286282
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: MEMB: .pending. 1756954634
Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message:
Membership 208: quorum still lost
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: MEMB: tstnode5 1773731850
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null>
now has id: 1756954634
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: MEMB: .pending. 1790509066
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: MEMB: .pending. 1807286282
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node
(null): id=1756954634 state=member (new) addr=r(0) ip(10.0.185.104)
votes=0 born=0 seen=208 proc=00000000000000000000000000000000
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
send_member_notification: Sending membership update 208 to 1 children
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null>
now has id: 1790509066
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x7fd27c0023b0 Node 1773731850 ((null)) born on: 208
Jan 15 13:19:26 tstnode5 corosync[2484]: [TOTEM ] A processor joined
or left the membership and a new membership was formed.
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node
(null): id=1790509066 state=member (new) addr=r(0) ip(10.0.185.106)
votes=0 born=0 seen=208 proc=00000000000000000000000000000000
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_new_peer: Node <null>
now has id: 1807286282
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node
(null): id=1807286282 state=member (new) addr=r(0) ip(10.0.185.107)
votes=0 born=0 seen=208 proc=00000000000000000000000000000000
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x1f705c0 Node 1790509066 (tstnode6) born on: 196
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x1f705c0 Node 1790509066 now known as tstnode6 (was: (null))
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode6 now has process list: 00000000000000000000000000013312 (78610)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode6 now has 1 quorum votes (was 0)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
send_member_notification: Sending membership update 208 to 1 children
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN:
route_ais_message: Sending message to local.crmd failed: ipc delivery
failed (rc=-2)
Jan 15 13:19:26 tstnode5 cib: [2493]: notice: ais_dispatch_message:
Membership 208: quorum acquired
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node
1790509066 is now known as tstnode6
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x1f707f0 Node 1807286282 (tstnode7) born on: 204
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node
tstnode6: id=1790509066 state=member addr=r(0) ip(10.0.185.106) votes=1
(new) born=196 seen=208 proc=00000000000000000000000000013312 (new)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x1f707f0 Node 1807286282 now known as tstnode7 (was: (null))
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode7 now has process list: 00000000000000000000000000013312 (78610)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode7 now has 1 quorum votes (was 0)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
update_expected_votes: Expected quorum votes 2 -> 3
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
send_member_notification: Sending membership update 208 to 1 children
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN:
route_ais_message: Sending message to local.crmd failed: ipc delivery
failed (rc=-2)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x1f703b0 Node 1756954634 (tstnode4) born on: 196
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
0x1f703b0 Node 1756954634 now known as tstnode4 (was: (null))
Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message:
Membership 208: quorum retained
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode4 now has process list: 00000000000000000000000000013312 (78610)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node tstnode4 now has 1 quorum votes (was 0)
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node
1807286282 is now known as tstnode7
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
update_expected_votes: Expected quorum votes 3 -> 4
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] info:
send_member_notification: Sending membership update 208 to 1 children
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node
tstnode7: id=1807286282 state=member addr=r(0) ip(10.0.185.107) votes=1
(new) born=204 seen=208 proc=00000000000000000000000000013312 (new)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN:
route_ais_message: Sending message to local.crmd failed: ipc delivery
failed (rc=-2)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN:
route_ais_message: Sending message to local.crmd failed: ipc delivery
failed (rc=-2)
Jan 15 13:19:26 tstnode5 cib: [2493]: info: ais_dispatch_message:
Membership 208: quorum retained
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_get_peer: Node
1756954634 is now known as tstnode4
Jan 15 13:19:26 tstnode5 cib: [2493]: info: crm_update_peer: Node
tstnode4: id=1756954634 state=member addr=r(0) ip(10.0.185.104) votes=1
(new) born=196 seen=208 proc=00000000000000000000000000013312 (new)
Jan 15 13:19:26 tstnode5 corosync[2484]: [pcmk ] WARN:
route_ais_message: Sending message to local.crmd failed: ipc delivery
failed (rc=-2)
Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from
node 1790509066
Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from
node 1807286282
Jan 15 13:19:26 tstnode5 corosync[2484]: [CPG ] downlist OLD from
node 1756954634
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_cib_control: CIB
connection established
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: get_cluster_type: Cluster
type is: 'openais'.
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info:
init_ais_connection_classic: Creating connection to our Corosync plugin
Jan 15 13:19:27 tstnode5 crmd: [2497]: info:
init_ais_connection_classic: AIS connection established
Jan 15 13:19:27 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc:
Recorded connection 0x7fd27c002800 for crmd/2497
Jan 15 13:19:27 tstnode5 corosync[2484]: [pcmk ] info: pcmk_ipc:
Sending membership update 208 to crmd
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: get_ais_nodeid: Server
details: id=1773731850 uname=tstnode5 cname=pcmk
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: init_ais_connection_once:
Connection to 'classic openais (with plugin)': established
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode5
now has id: 1773731850
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node
1773731850 is now known as tstnode5
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode5 is now unknown
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_ha_control: Connected to
the cluster
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: Delaying start,
no membership data (0000000000100000)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crmd_init: Starting crmd's
mainloop
Jan 15 13:19:27 tstnode5 crmd: [2497]: notice: ais_dispatch_message:
Membership 208: quorum acquired
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode4
now has id: 1756954634
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node
1756954634 is now known as tstnode4
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode4 is now unknown
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode4 is now member (was unknown)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node
tstnode4: id=1756954634 state=member (new) addr=r(0) ip(10.0.185.104)
votes=1 born=196 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:27 tstnode5 crmd: [2497]: notice: crmd_peer_update: Status
update: Client tstnode5/crmd now has status [online] (DC=<null>)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode5 is now member (was unknown)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node
tstnode5: id=1773731850 state=member (new) addr=r(0) ip(10.0.185.105)
(new) votes=1 (new) born=208 seen=208
proc=00000000000000000000000000111312 (new)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode6
now has id: 1790509066
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node
1790509066 is now known as tstnode6
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode6 is now unknown
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode6 is now member (was unknown)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node
tstnode6: id=1790509066 state=member (new) addr=r(0) ip(10.0.185.106)
votes=1 born=196 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node tstnode7
now has id: 1807286282
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_new_peer: Node
1807286282 is now known as tstnode7
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode7 is now unknown
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode7 is now member (was unknown)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: crm_update_peer: Node
tstnode7: id=1807286282 state=member (new) addr=r(0) ip(10.0.185.107)
votes=1 born=204 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: Delaying start,
Config not read (0000000000000040)
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback:
Shutdown escalation occurs after: 1200000ms
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback:
Checking for expired actions every 900000ms
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: config_query_callback:
Sending expected-votes=4 to corosync
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_started: The local CRM
is operational
Jan 15 13:19:27 tstnode5 crmd: [2497]: info: do_state_transition: State
transition S_STARTING -> S_PENDING [ input=I_PENDING
cause=C_FSA_INTERNAL origin=do_started ]
Jan 15 13:19:27 tstnode5 corosync[2484]: [TOTEM ] A processor failed,
forming new configuration.
Jan 15 13:19:28 tstnode5 crmd: [2497]: info: ais_dispatch_message:
Membership 208: quorum retained
Jan 15 13:19:28 tstnode5 crmd: [2497]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] notice:
pcmk_peer_update: Transitional membership event on ring 212: memb=1,
new=0, lost=3
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: memb: tstnode5 1773731850
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: lost: tstnode4 1756954634
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: lost: tstnode6 1790509066
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: lost: tstnode7 1807286282
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] notice:
pcmk_peer_update: Stable membership event on ring 212: memb=1, new=0, lost=0
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
pcmk_peer_update: MEMB: tstnode5 1773731850
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
ais_mark_unseen_peer_dead: Node tstnode4 was not seen in the previous
transition
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node 1756954634/tstnode4 is now: lost
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
ais_mark_unseen_peer_dead: Node tstnode6 was not seen in the previous
transition
Jan 15 13:19:28 tstnode5 cib: [2493]: notice: ais_dispatch_message:
Membership 212: quorum lost
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node 1790509066/tstnode6 is now: lost
Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node
tstnode4: id=1756954634 state=lost (new) addr=r(0) ip(10.0.185.104)
votes=1 born=196 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
ais_mark_unseen_peer_dead: Node tstnode7 was not seen in the previous
transition
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info: update_member:
Node 1807286282/tstnode7 is now: lost
Jan 15 13:19:28 tstnode5 corosync[2484]: [pcmk ] info:
send_member_notification: Sending membership update 212 to 2 children
Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node
tstnode6: id=1790509066 state=lost (new) addr=r(0) ip(10.0.185.106)
votes=1 born=196 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:28 tstnode5 corosync[2484]: [TOTEM ] A processor joined
or left the membership and a new membership was formed.
Jan 15 13:19:28 tstnode5 corosync[2484]: [CPG ] chosen downlist from
node r(0) ip(10.0.185.105)
Jan 15 13:19:28 tstnode5 corosync[2484]: [MAIN ] Completed service
synchronization, ready to provide service.
Jan 15 13:19:28 tstnode5 cib: [2493]: info: crm_update_peer: Node
tstnode7: id=1807286282 state=lost (new) addr=r(0) ip(10.0.185.107)
votes=1 born=204 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: te_connect_stonith: Connected
Jan 15 13:19:29 tstnode5 crmd: [2497]: notice: ais_dispatch_message:
Membership 212: quorum lost
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode4 is now lost (was member)
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node
tstnode4: id=1756954634 state=lost (new) addr=r(0) ip(10.0.185.104)
votes=1 born=196 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode6 is now lost (was member)
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node
tstnode6: id=1790509066 state=lost (new) addr=r(0) ip(10.0.185.106)
votes=1 born=196 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: ais_status_callback:
status: tstnode7 is now lost (was member)
Jan 15 13:19:29 tstnode5 crmd: [2497]: info: crm_update_peer: Node
tstnode7: id=1807286282 state=lost (new) addr=r(0) ip(10.0.185.107)
votes=1 born=204 seen=208 proc=00000000000000000000000000013312
Jan 15 13:19:31 tstnode5 attrd: [2495]: info: cib_connect: Connected to
the CIB after 1 signon attempts
Jan 15 13:19:31 tstnode5 attrd: [2495]: info: cib_connect: Sending full
refresh
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crm_timer_popped: Election
Trigger (I_DC_TIMEOUT) just popped! (60000ms)
Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: do_log: FSA: Input
I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State
transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT
cause=C_TIMER_POPPED origin=crm_timer_popped ]
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State
transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC
cause=C_FSA_INTERNAL origin=do_election_check ]
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_te_control: Registering
TE UUID: fef69ffc-bef2-475d-b3c4-ddaa94ab8af6
Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN:
cib_client_add_notify_callback: Callback already present
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: set_graph_functions:
Setting custom graph functions
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: unpack_graph: Unpacked
transition -1: 0 actions in 0 synapses
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_takeover: Taking over
DC status for this partition
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_readwrite: We
are now in R/W mode
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_master for section 'all'
(origin=local/crmd/5, version=0.12.1): ok (rc=0)
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section cib (origin=local/crmd/6,
version=0.12.2): ok (rc=0)
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section crm_config
(origin=local/crmd/8, version=0.12.3): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: join_make_offer: Making
join offers based on membership 212
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_offer_all:
join-1: Waiting on 1 outstanding join acks
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section crm_config
(origin=local/crmd/10, version=0.12.4): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback:
Shutdown escalation occurs after: 1200000ms
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback:
Checking for expired actions every 900000ms
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: config_query_callback:
Sending expected-votes=4 to corosync
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: ais_dispatch_message:
Membership 212: quorum still lost
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crmd_ais_dispatch: Setting
expected votes to 4
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: update_dc: Set DC to
tstnode5 (3.0.5)
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section crm_config
(origin=local/crmd/13, version=0.12.5): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: ais_dispatch_message:
Membership 212: quorum still lost
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crmd_ais_dispatch: Setting
expected votes to 4
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section crm_config
(origin=local/crmd/16, version=0.12.6): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State
transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED
cause=C_FSA_INTERNAL origin=check_join_state ]
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: All 1
cluster nodes responded to the join offer.
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_finalize:
join-1: Syncing the CIB from tstnode5 to the rest of the cluster
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_sync for section 'all' (origin=local/crmd/17,
version=0.12.6): ok (rc=0)
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section nodes
(origin=local/crmd/18, version=0.12.7): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: update_attrd: Connecting to
attrd...
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating
hash entry for terminate
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating
hash entry for shutdown
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_delete for section
//node_state[@uname='tstnode5']/transient_attributes
(origin=local/crmd/19, version=0.12.8): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: erase_xpath_callback:
Deletion of "//node_state[@uname='tstnode5']/transient_attributes": ok
(rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_ack: join-1:
Updating node state to member for tstnode5
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_delete for section
//node_state[@uname='tstnode5']/lrm (origin=local/crmd/20,
version=0.12.9): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: erase_xpath_callback:
Deletion of "//node_state[@uname='tstnode5']/lrm": ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State
transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED
cause=C_FSA_INTERNAL origin=check_join_state ]
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: All 1
cluster nodes are eligible to run resources.
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_dc_join_final: Ensuring
DC, quorum and node attributes are up-to-date
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: crm_update_quorum: Updating
quorum status to false (call=24)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph:
do_te_invoke:173 - Triggered transition abort (complete=1) : Peer Cancelled
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_local_callback:
Sending full refresh (origin=crmd)
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update:
Sending flush op to all hosts for: terminate (<null>)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 25:
Requesting the current CIB: S_POLICY_ENGINE
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section nodes
(origin=local/crmd/22, version=0.12.11): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match
for shutdown action on tstnode4
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff:
Stonith/shutdown of tstnode4 not matched
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph:
te_update_diff:200 - Triggered transition abort (complete=1,
tag=node_state, id=tstnode4, magic=NA, cib=0.12.12) : Node failure
Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match
for shutdown action on tstnode6
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff:
Stonith/shutdown of tstnode6 not matched
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph:
te_update_diff:200 - Triggered transition abort (complete=1,
tag=node_state, id=tstnode6, magic=NA, cib=0.12.12) : Node failure
Jan 15 13:20:28 tstnode5 crmd: [2497]: WARN: match_down_event: No match
for shutdown action on tstnode7
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_update_diff:
Stonith/shutdown of tstnode7 not matched
Jan 15 13:20:28 tstnode5 cib: [2493]: info: cib_process_request:
Operation complete: op cib_modify for section cib (origin=local/crmd/24,
version=0.12.13): ok (rc=0)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: abort_transition_graph:
te_update_diff:200 - Triggered transition abort (complete=1,
tag=node_state, id=tstnode7, magic=NA, cib=0.12.12) : Node failure
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 26:
Requesting the current CIB: S_POLICY_ENGINE
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 27:
Requesting the current CIB: S_POLICY_ENGINE
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke: Query 28:
Requesting the current CIB: S_POLICY_ENGINE
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_pe_invoke_callback:
Invoking the PE: query=28, ref=pe_calc-dc-1358256028-8, seq=212, quorate=0
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update:
Sending flush op to all hosts for: shutdown (<null>)
Jan 15 13:20:28 tstnode5 pengine: [2496]: WARN: cluster_status: We do
not have quorum - fencing and resource management disabled
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: unpack_graph: Unpacked
transition 0: 1 actions in 1 synapses
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_te_invoke: Processing
graph 0 (ref=pe_calc-dc-1358256028-8) derived from
/var/lib/pengine/pe-input-30.bz2
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_rsc_command: Initiating
action 2: probe_complete probe_complete on tstnode5 (local) - no waiting
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: run_graph:
====================================================
Jan 15 13:20:28 tstnode5 crmd: [2497]: notice: run_graph: Transition 0
(Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-30.bz2): Complete
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: te_graph_trigger:
Transition 0 is now complete
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: find_hash_entry: Creating
hash entry for probe_complete
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: notify_crmd: Transition 0
status: done - <null>
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_trigger_update:
Sending flush op to all hosts for: probe_complete (true)
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Jan 15 13:20:28 tstnode5 crmd: [2497]: info: do_state_transition:
Starting PEngine Recheck Timer
Jan 15 13:20:28 tstnode5 attrd: [2495]: info: attrd_perform_update: Sent
update 8: probe_complete=true