Hello Andrew,

I do apologize for that. When we found out that cman needed to be included for DLM support, it kind of got us scrambling here. Currently we are experiencing a problem with the o2cb primitive and ocfs2_controld.pcmk, whose error I am reluctant to post here because I have already created a separate email for it.

Nick.

On Wed, Nov 2, 2011 at 10:22 AM, Andrew Beekhof <andrew@xxxxxxxxxxx> wrote:
> Please avoid starting a new thread every 5s. It's way too much noise.
> Pick one software stack and concentrate on getting that working rather
> than attempting them all at once.
>
> On Tue, Nov 1, 2011 at 8:15 AM, Nick Khamis <symack@xxxxxxxxx> wrote:
>> Hello Everyone,
>>
>> I have the following built from source:
>>
>> Corosync 1.4.2
>> Pacemaker 1.1.6
>> Cman 3.1.7
>>
>> Corosync with service.d/pcmk works fine; the pcmk plugin loads, crm starts, etc. I
>> have an existing CIB configuration as shown below, and the RAs load fine.
>>
>>
>> <corosync.conf>
>>
>> totem {
>>
>>     version: 2
>>
>>     # How long before declaring a token lost (ms)
>>     token: 5000
>>
>>     # How many token retransmits before forming a new configuration
>>     token_retransmits_before_loss_const: 20
>>
>>     # How long to wait for join messages in the membership protocol (ms)
>>     join: 1000
>>
>>     # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
>>     consensus: 7500
>>
>>     # Turn off the virtual synchrony filter
>>     vsftype: none
>>
>>     # Number of messages that may be sent by one processor on receipt of the token
>>     max_messages: 20
>>
>>     # Disable encryption
>>     secauth: off
>>
>>     # How many threads to use for encryption/decryption
>>     threads: 0
>>
>>     # Limit generated nodeids to 31-bits (positive signed integers)
>>     clear_node_high_bit: yes
>>
>>     # Optionally assign a fixed node id (integer)
>>     nodeid: 4
>>
>>     interface {
>>         ringnumber: 0
>>
>>         # The following three values need to be set based on your environment
>>         bindnetaddr: 192.168.2.0
>>         mcastaddr: 226.94.1.1
>>         mcastport: 5405
>>     }
>> }
>>
>> amf {
>>     mode: disabled
>> }
>>
>>
>> <cib conf>
>>
>> node astdrbd1 \
>>     attributes standby="off"
>> node astdrbd2 \
>>     attributes standby="off"
>> primitive astIP ocf:heartbeat:IPaddr2 \
>>     op monitor interval="60" timeout="20" \
>>     params ip="192.168.2.6" cidr_netmask="24" \
>>         nic="eth2" broadcast="192.168.2.255" \
>>         lvs_support="true"
>> primitive astDRBD ocf:linbit:drbd \
>>     params drbd_resource="r0.res" \
>>     op monitor role=Master interval="20" timeout="20" \
>>     op monitor role=Slave interval="30" timeout="20"
>> ms msAstDRBD astDRBD \
>>     meta master-max="2" clone-max="2" interleave="true" \
>>         notify="true" globally-unique="false"
>> primitive astDLM ocf:pacemaker:controld \
>>     op monitor interval="120s"
>> primitive astO2CB ocf:pacemaker:o2cb op monitor interval="120s"
>> primitive astFilesystem ocf:heartbeat:Filesystem \
>>     params device="/dev/drbd0" directory="/service" fstype="ocfs2" \
>>     op monitor interval="120" \
>>     meta target-role="Started"
>> order astDrbdAfterIP \
>>     inf: astIP msAstDRBD
>> order dlmAfterDRBD \
>>     inf: msAstDRBD:promote astDLM:start
>> order o2cbAfterDLM \
>>     inf: astDLM:promote astO2CB:start
>> order astFilesystemAfterO2cb \
>>     inf: astO2CB:promote astFilesystem:start
>> colocation astDrbdOnIP \
>>     inf: msAstDRBD:Master astIP
>> colocation dlmOnDRBD \
>>     inf: astDLM msAstDRBD:Master
>> colocation o2cbOnDLM \
>>     inf: astO2CB astDLM:Master
>> colocation astFilesystemOnO2CB \
>>     inf: astFilesystem astO2CB:Master
>> location prefer-ast1 astIP inf: astdrbd1
>> location prefer-ast2 astIP inf: astdrbd2
>> property $id="cib-bootstrap-options" \
>>     no-quorum-policy="ignore" \
>>     stonith-enabled="false" \
>>     expected-quorum-votes="5" \
>>     dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>>     cluster-recheck-interval="0" \
>>     cluster-infrastructure="openais"
>> rsc_defaults $id="rsc-options" \
>>     resource-stickiness="100"
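One note on the configuration above, since it is what the cman change is ultimately for: for a dual-primary OCFS2 mount I expect astDLM, astO2CB and astFilesystem to end up wrapped in clones so they can run on both nodes at once, with the order and colocation constraints pointing at the clones (and using plain start ordering, since controld and o2cb are not master/slave resources, so :promote and :Master do not apply to them). What follows is only an untested sketch using my resource names:

# Untested sketch: wrap the OCFS2 plumbing in interleaved clones so each
# piece can run on both nodes of the dual-primary setup. Resource names
# are the ones from the configuration above; the existing constraints
# would then reference cloneAstDLM/cloneAstO2CB/cloneAstFilesystem.
crm configure <<'EOF'
clone cloneAstDLM astDLM meta interleave="true"
clone cloneAstO2CB astO2CB meta interleave="true"
clone cloneAstFilesystem astFilesystem meta interleave="true"
commit
EOF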
>>
>> Add cman to the formula for active/active support, and I am not sure
>> how the whole thing should be started.
>>
>>
>> <cluster.conf>
>>
>> <?xml version="1.0"?>
>> <cluster name="ASTCluster" config_version="3">
>>     <logging debug="off"/>
>>     <cman expected_votes="1" two_node="1"/>
>>     <clusternodes>
>>         <clusternode name="astdrbd1" nodeid="1">
>>             <fence>
>>                 <method name="pcmk-redirect">
>>                     <device name="pcmk" port="astdrbd1"/>
>>                 </method>
>>             </fence>
>>         </clusternode>
>>         <clusternode name="astdrbd2" nodeid="2">
>>             <fence>
>>                 <method name="pcmk-redirect">
>>                     <device name="pcmk" port="astdrbd2"/>
>>                 </method>
>>             </fence>
>>         </clusternode>
>>     </clusternodes>
>>     <fencedevices>
>>         <fencedevice agent="fence_pcmk" name="pcmk"/>
>>     </fencedevices>
>> </cluster>
>>
>> /etc/corosync/service.d/pcmk has been renamed to pcmk.bak.
>>
>> Starting cman works fine.
>>
>> When trying to start Pacemaker I get the following:
>>
>> /etc/init.d/pacemaker start
>>
>> Oct 27 15:41:54 astdrbd1 pacemakerd: [18628]: info: crm_log_init_worker: Changed active directory to /usr/var/lib/heartbeat/cores/root
>> Oct 27 15:42:07 astdrbd1 pacemakerd: [18630]: info: Invoked: pacemakerd -$
>> Oct 27 15:42:07 astdrbd1 pacemakerd: [18630]: info: crm_log_init_worker: Changed active directory to /usr/var/lib/heartbeat/cores/root
>> Oct 27 16:17:01 astdrbd1 /USR/SBIN/CRON[30164]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
>> Oct 27 17:01:16 astdrbd1 udevd-work[4484]: kernel-provided name 'ocfs2_control' and NAME= 'misc/ocfs2_control' disagree, please use SYMLINK+= or change the kernel to provide the proper name
>> Oct 27 17:01:16 astdrbd1 kernel: [26174.953112] ocfs2: Registered cluster interface user
>> Oct 27 17:01:16 astdrbd1 kernel: [26175.082045] OCFS2 Node Manager 1.5.0
>> Oct 27 17:01:16 astdrbd1 kernel: [26175.252185] OCFS2 1.5.0
>> Oct 27 17:01:17 astdrbd1 ocfs2_controld: [4497]: info: get_cluster_type: Assuming a 'heartbeat' based cluster
>> Oct 27 17:01:17 astdrbd1 ocfs2_controld: [4497]: CRIT: get_cluster_type: This installation of Pacemaker does not support the 'heartbeat' cluster infrastructure. Terminating.
>>
>> I never installed Heartbeat, and I was never quite sure why /var/lib/heartbeat exists in the first place.
>>
>> pacemakerd -v
>>
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: config_find_next: Processing additional service options...
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: get_config_opt: Found 'pacemaker' for option: name
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: get_config_opt: Found '0' for option: ver
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: get_cluster_type: Detected an active 'classic openais (with plugin)' cluster
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: read_config: Reading configure for stack: classic openais (with plugin)
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: config_find_next: Processing additional service options...
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: get_config_opt: Found 'pacemaker' for option: name
>> pacemakerd[2038]: 2011/10/31_16:43:47 info: get_config_opt: Found '0' for option: ver
>> pacemakerd[2038]: 2011/10/31_16:43:47 ERROR: read_config: We can only start Pacemaker from init if using version 1 of the Pacemaker plugin for Corosync.
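For anyone hitting the same ERROR line later: it seems to be asking for the Pacemaker plugin to be declared with ver: 1, so that pacemakerd is started from init rather than spawned by the Corosync plugin. My understanding is that with cman the plugin file is not needed at all and can simply be removed from service.d; if it is kept, an untested sketch of the stanza the error expects would be:

# Untested sketch: declare the Pacemaker plugin with ver: 1 so pacemakerd
# can be started from init (per the ERROR above). With cman the plugin is
# normally not used at all, so deleting the file may be the simpler route.
cat > /etc/corosync/service.d/pcmk <<'EOF'
service {
    name: pacemaker
    ver: 1
}
EOF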
>>
>> Can someone please help me understand what is going on here? Because at this point there are:
>>
>> Two cluster managers (Pacemaker, CMAN)
>> Two messaging layers (Corosync, OpenAIS), and for some reason some Heartbeat material?
>> I am not sure, but I think I have two sets of RAs as well (the ClusterLabs RAs and the CMAN RAs)?
>>
>> Please help,
>>
>> Nick.
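In case it helps anyone searching the archives, this is the startup order I am aiming for with the cman-based stack, as I understand it; only a sketch of intent, not something that works here yet:

# Sketch of the intended per-node startup order (assumption, not verified here)
/etc/init.d/cman start        # cman brings up corosync (using cluster.conf) plus fenced and dlm_controld
/etc/init.d/pacemaker start   # pacemakerd then spawns the cib/crmd/etc. daemons
crm_mon -1                    # one-shot status check once both nodes have joined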