Hi,

On Fri, Dec 23, 2011 at 12:23 PM, Anthony BRODARD <brodard.anthony@xxxxxxxxx> wrote:
> Hi list,
>
> I'm trying to configure corosync + DRBD on 2 servers, bart and lisa.
> DRBD works fine, no problem.
> But for corosync, I have a problem with resources' configuration. DRBD is
> correctly managed, I can move it on each server without any problem.
> But other resources (vip, apache, mysql and filesystem for drbd), which are
> included in a group, refuse to start, and are not displayed in the command
> "crm_mon" :
>
> ============
> Last updated: Fri Dec 23 11:10:54 2011
> Stack: openais
> Current DC: bart - partition with quorum
> Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
> 2 Nodes configured, 2 expected votes
> 2 Resources configured.
> ============
>
> Online: [ bart lisa ]
>
>  Master/Slave Set: ms-drbd-rt
>      Masters: [ bart ]
>      Slaves: [ lisa ]
>
>
> First, the configuration :
>
> [bart]~ # crm configure show
> node bart
> node lisa
> primitive fs-data ocf:heartbeat:Filesystem \
>     params device="/dev/drbd/by-res/data" directory="/data/"
> fstype="ext3"

Needs monitor operation.

> primitive rt-apache2 ocf:heartbeat:apache \
>     params configfile="/etc/apache2/apache2.conf" port="443" \
>     op monitor interval="10" timeout="20s" depth="0" \
>     op stop interval="0" timeout="40" \
>     op start interval="0" timeout="60" \
>     meta is-managed="false"

Need to remove is-managed="false"

> primitive rt-drbd ocf:linbit:drbd \
>     params drbd_resource="data" \
>     op monitor interval="15s" ignore_deprecation="true" \
>     op stop interval="0" timeout="100" \
>     op start interval="0" timeout="240"

Need to specify 2 monitor operations, one for role=Master one for role=Slave with different intervals, a bit higher in favor of the Master.

> primitive rt-mysql lsb:mysql

I strongly recommend using an OCF RA, not the init script.
Specifically this one https://github.com/fghaas/resource-agents/blob/master/heartbeat/mysql

> primitive rt-vip ocf:heartbeat:IPaddr2 \
>     params ip="10.1.150.150" cidr_netmask="32" \
>     op monitor interval="5s"
> group rt-grp rt-apache2 fs-data rt-mysql rt-vip \
>     meta target-role="Started"
> ms ms-drbd-rt rt-drbd \
>     meta master-max="1" master-node-max="1" clone-max="2"
> clone-node-max="1" notify="true"
> location prefer-rt-bart rt-grp 1: bart
> colocation rt-on-drbd inf: rt-grp ms-drbd-rt:Master
> order drbd-before-rt inf: ms-drbd-rt:promote rt-grp:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="false" \
>     no-quorum-policy="ignore" \
>     default-resource-stickiness="100" \
>     start-failure-is-fatal="false"

Remove start-failure-is-fatal, if you don't know what it does, don't enable it.

>
>
> When I try "crm resource restart rt-grp", syslog says:
>
> Dec 23 11:16:25 bart cibadmin: [4271]: info: Invoked: cibadmin -Ql -o
> resources
> Dec 23 11:16:25 bart cibadmin: [4273]: info: Invoked: cibadmin -p -R -o
> resources
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: - <cib
> admin_epoch="0" epoch="88" num_updates="1" >
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> <configuration >
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> <resources >
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> <group id="rt-grp" >
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> <meta_attributes id="rt-grp-meta_attributes" >
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> <nvpair value="Stopped" id="rt-grp-meta_attributes-target-role" />
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> </meta_attributes>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> </group>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> </resources>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: -
> </configuration>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: - </cib>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: + <cib
> admin_epoch="0" epoch="89" num_updates="1" >
> Dec 23 11:16:25 bart crmd: [3315]: info: abort_transition_graph:
> need_abort:59 - Triggered transition abort (complete=1) : Non-status change
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> <configuration >
> Dec 23 11:16:25 bart crmd: [3315]: info: need_abort: Aborting on change to
> admin_epoch
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> <resources >
> Dec 23 11:16:25 bart crmd: [3315]: info: do_state_transition: State
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
> origin=abort_transition_graph ]
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> <group id="rt-grp" >
> Dec 23 11:16:25 bart crmd: [3315]: info: do_state_transition: All 2 cluster
> nodes are eligible to run resources.
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> <meta_attributes id="rt-grp-meta_attributes" >
> Dec 23 11:16:25 bart crmd: [3315]: info: do_pe_invoke: Query 116: Requesting
> the current CIB: S_POLICY_ENGINE
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> <nvpair value="Started" id="rt-grp-meta_attributes-target-role" />
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> </meta_attributes>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> </group>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> </resources>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: +
> </configuration>
> Dec 23 11:16:25 bart cib: [3311]: info: log_data_element: cib:diff: + </cib>
> Dec 23 11:16:25 bart cib: [3311]: info: cib_process_request: Operation
> complete: op cib_replace for section resources (origin=local/cibadmin/2,
> version=0.89.1): ok (rc=0)
> Dec 23 11:16:26 bart crmd: [3315]: info: do_pe_invoke_callback: Invoking the
> PE: query=116, ref=pe_calc-dc-1324635386-87, seq=96, quorate=1
> Dec 23 11:16:26 bart pengine: [3314]: notice: unpack_config: On loss of CCM
> Quorum: Ignore
> Dec 23 11:16:26 bart pengine: [3314]: info: unpack_config: Node scores:
> 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> Dec 23 11:16:26 bart cib: [4274]: info: write_cib_contents: Archived
> previous version as /var/lib/heartbeat/crm/cib-63.raw
> Dec 23 11:16:26 bart pengine: [3314]: info: determine_online_status: Node
> bart is online
> Dec 23 11:16:26 bart pengine: [3314]: info: determine_online_status: Node
> lisa is online
> Dec 23 11:16:26 bart pengine: [3314]: notice: unpack_rsc_op: Operation
> rt-drbd:1_monitor_0 found resource rt-drbd:1 active on lisa
> Dec 23 11:16:26 bart pengine: [3314]: notice: group_print: Resource Group:
> rt-grp
> Dec 23 11:16:26 bart pengine: [3314]: notice: native_print: rt-apache2
> (ocf::heartbeat:apache): Stopped (unmanaged)
> Dec 23 11:16:26 bart pengine: [3314]: notice: native_print: fs-data
> (ocf::heartbeat:Filesystem): Stopped
> Dec 23 11:16:26 bart pengine: [3314]: notice: native_print: rt-mysql
> (lsb:mysql): Stopped
> Dec 23 11:16:26 bart pengine: [3314]: notice: native_print: rt-vip
> (ocf::heartbeat:IPaddr2): Stopped
> Dec 23 11:16:26 bart pengine: [3314]: notice: clone_print: Master/Slave
> Set: ms-drbd-rt
> Dec 23 11:16:26 bart pengine: [3314]: notice: short_print: Masters: [
> bart ]
> Dec 23 11:16:26 bart cib: [4274]: info: write_cib_contents: Wrote version
> 0.89.0 of the CIB to disk (digest: cd534ad8c2b3c1f8add883e157966248)
> Dec 23 11:16:26 bart pengine: [3314]: notice: short_print: Slaves: [
> lisa ]
> Dec 23 11:16:26 bart pengine: [3314]: info: master_color: Promoting
> rt-drbd:0 (Master bart)
> Dec 23 11:16:26 bart pengine: [3314]: info: master_color: ms-drbd-rt:
> Promoted 1 instances of a possible 1 to master
> Dec 23 11:16:26 bart pengine: [3314]: info: native_color: Unmanaged resource
> rt-apache2 allocated to 'nowhere': inactive
> Dec 23 11:16:26 bart pengine: [3314]: info: native_merge_weights: fs-data:
> Rolling back scores from rt-mysql
> Dec 23 11:16:26 bart cib: [4274]: info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.f8vE7m (digest:
> /var/lib/heartbeat/crm/cib.JaAGaL)
> Dec 23 11:16:26 bart pengine: [3314]: info: native_color: Resource fs-data
> cannot run anywhere
> Dec 23 11:16:26 bart pengine: [3314]: info: native_merge_weights: rt-mysql:
> Rolling back scores from rt-vip
> Dec 23 11:16:26 bart pengine: [3314]: info: native_color: Resource rt-mysql
> cannot run anywhere
> Dec 23 11:16:26 bart pengine: [3314]: info: native_color: Resource rt-vip
> cannot run anywhere
> Dec 23 11:16:26 bart pengine: [3314]: info: master_color: Promoting
> rt-drbd:0 (Master bart)
> Dec 23 11:16:26 bart pengine: [3314]: info: master_color: ms-drbd-rt:
> Promoted 1 instances of a possible 1 to master
> Dec 23 11:16:26 bart pengine: [3314]: ERROR: create_notification_boundaries:
> Creating boundaries for ms-drbd-rt
> Dec 23 11:16:26 bart pengine: [3314]: ERROR: create_notification_boundaries:
> Creating boundaries for ms-drbd-rt
> Dec 23 11:16:26 bart pengine: [3314]: ERROR: create_notification_boundaries:
> Creating boundaries for ms-drbd-rt
> Dec 23 11:16:26 bart pengine: [3314]: ERROR: create_notification_boundaries:
> Creating boundaries for ms-drbd-rt
> Dec 23 11:16:26 bart pengine: [3314]: notice: LogActions: Leave resource
> rt-apache2 (Stopped unmanaged)
> Dec 23 11:16:26 bart pengine: [3314]: notice: LogActions: Leave resource
> fs-data (Stopped)
> Dec 23 11:16:26 bart pengine: [3314]: notice: LogActions: Leave resource
> rt-mysql (Stopped)
> Dec 23 11:16:26 bart pengine: [3314]: notice: LogActions: Leave resource
> rt-vip (Stopped)
> Dec 23 11:16:26 bart pengine: [3314]: notice: LogActions: Leave resource
> rt-drbd:0 (Master bart)
> Dec 23 11:16:26 bart pengine: [3314]: notice: LogActions: Leave resource
> rt-drbd:1 (Slave lisa)
> Dec 23 11:16:26 bart crmd: [3315]: info: do_state_transition: State
> transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
> cause=C_IPC_MESSAGE origin=handle_response ]
> Dec 23 11:16:26 bart crmd: [3315]: info: unpack_graph: Unpacked transition
> 17: 0 actions in 0 synapses
> Dec 23 11:16:26 bart crmd: [3315]: info: do_te_invoke: Processing graph 17
> (ref=pe_calc-dc-1324635386-87) derived from
> /var/lib/pengine/pe-input-1084.bz2
> Dec 23 11:16:26 bart pengine: [3314]: info: process_pe_message: Transition
> 17: PEngine Input stored in: /var/lib/pengine/pe-input-1084.bz2
> Dec 23 11:16:26 bart crmd: [3315]: info: run_graph:
> ====================================================
> Dec 23 11:16:26 bart crmd: [3315]: notice: run_graph: Transition 17
> (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pengine/pe-input-1084.bz2): Complete
> Dec 23 11:16:26 bart crmd: [3315]: info: te_graph_trigger: Transition 17 is
> now complete
> Dec 23 11:16:26 bart crmd: [3315]: info: notify_crmd: Transition 17 status:
> done - <null>
> Dec 23 11:16:26 bart crmd: [3315]: info: do_state_transition: State
> transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
> cause=C_FSA_INTERNAL origin=notify_crmd ]
> Dec 23 11:16:26 bart crmd: [3315]: info: do_state_transition: Starting
> PEngine Recheck Timer
>
>
> I don't understand what is wrong in my configuration. Did you have any idea?

Additional documentation available at:
http://www.clusterlabs.org/wiki/DRBD_MySQL_HowTo
http://www.drbd.org/users-guide-8.3/s-pacemaker-crm-drbd-backed-service.html
http://www.linbit.com/en/education/tech-guides/mysql-high-availability-on-the-pacemaker-cluster-stack/
http://www.hastexo.com/content/mysql-high-availability-sprint-launch-pacemaker

p.s.: the mailing list changed, this is the new one that should be used.

HTH,
Dan

>
> Regards
> Anthony
>
>
>
> _______________________________________________
> Openais mailing list
> Openais@xxxxxxxxxxxxxxxxxxxxxxxxxx
> https://lists.linuxfoundation.org/mailman/listinfo/openais

--
Dan Frincu
CCNA, RHCE
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
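
For reference, the inline comments in the reply could be applied to the posted primitives roughly as follows. This is only a sketch in crm shell syntax, not a tested configuration. The monitor intervals and timeouts are illustrative values that do not come from the original post, the ocf:heartbeat:mysql agent is assumed to work with its default parameters on these hosts, and ignore_deprecation is dropped on the assumption that ocf:linbit:drbd does not need it. Everything not shown (rt-vip, the group, the ms resource and the constraints) is assumed to stay as posted.

```
# Filesystem: monitor operation added (interval and timeout are assumed values)
primitive fs-data ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/data" directory="/data/" fstype="ext3" \
        op monitor interval="20s" timeout="40s"
# Apache: meta is-managed="false" removed so the cluster is allowed to start it
primitive rt-apache2 ocf:heartbeat:apache \
        params configfile="/etc/apache2/apache2.conf" port="443" \
        op monitor interval="10" timeout="20s" depth="0" \
        op stop interval="0" timeout="40" \
        op start interval="0" timeout="60"
# DRBD: one monitor per role, with different intervals (values are assumed)
primitive rt-drbd ocf:linbit:drbd \
        params drbd_resource="data" \
        op monitor interval="29s" role="Master" \
        op monitor interval="31s" role="Slave" \
        op stop interval="0" timeout="100" \
        op start interval="0" timeout="240"
# MySQL: OCF resource agent instead of the LSB init script
# (agent parameters left at their defaults, which is an assumption)
primitive rt-mysql ocf:heartbeat:mysql \
        op monitor interval="30s" timeout="30s"
```

The "#" lines are annotations for the reader and may need to be stripped before loading, depending on the crm shell version. The property section would then be the posted one without the start-failure-is-fatal="false" line.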