Re: Node can't join already quorated cluster

You won't see services until the rgmanager daemon is running. Look at this:

> node1-hb                                  1 Online
> node2-hb                               2 Online, Local, rgmanager

This tells you that both node1-hb and node2-hb are running cman (that's the "Online" part), but only node2-hb is running rgmanager. So on node1-hb, run '/etc/init.d/rgmanager start'.
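
If it helps, the sequence on node1-hb would look something like this (commands from memory of a RHEL 5 box, so treat it as a sketch and adjust as needed):

    # start the resource group manager on node1-hb
    /etc/init.d/rgmanager start
    # optionally, make it start on boot as well
    chkconfig rgmanager on
    # re-check membership; node1-hb should now show "rgmanager" next to "Online"
    clustat

Once rgmanager is up on both nodes, clustat on either node should show the same service table.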

As for the fence requirement, I agree that it should be said more directly, but it is covered here:

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/ch-fencing.html

Specifically:

    For example, DLM and GFS2, when notified of a node failure, suspend
    activity until they detect that fenced has completed fencing the
    failed node. Upon confirmation that the failed node is fenced, DLM
    and GFS2 perform recovery. DLM releases locks of the failed node;
    GFS2 recovers the journal of the failed node.

The key is "Upon confirmation that the failed node is fenced, DLM and GFS2 perform recovery."

If no fence device is configured, that confirmation never arrives, so the cluster stays blocked (effectively hung forever, by design).
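
This is also why manual fencing isn't supported: recovery waits on a human. With fence_manual, after a node actually fails you would have to acknowledge the fence yourself before anything recovers, roughly along these lines (check the man page on your release, the exact syntax has changed over time):

    # run on a surviving node, only after confirming the failed node is really powered off
    fence_ack_manual -n node1-hb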

This requirement is also documented on the official cluster wiki:

https://fedorahosted.org/cluster/wiki/FAQ/Fencing#fence_manual2
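
When the VMware admins do open up API access, the manual agent usually gets swapped for fence_vmware_soap. Roughly like this in cluster.conf; the hostname, credentials and port (VM) names below are placeholders, and the exact attributes depend on the agent version shipped with your release, so check 'man fence_vmware_soap' before copying anything:

    <clusternode name="node1-hb" nodeid="1" votes="1">
            <fence>
                    <method name="vmware">
                            <device name="vcenter" port="node1-vm-name"/>
                    </method>
            </fence>
    </clusternode>
    ...
    <fencedevices>
            <fencedevice agent="fence_vmware_soap" name="vcenter"
                         ipaddr="vcenter.example.com" login="fence_user"
                         passwd="secret" ssl="on"/>
    </fencedevices>

Test it with 'fence_node <nodename>' from the other node before trusting it in production.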

Digimer

On 06/20/2012 05:54 PM, Javier Vela wrote:
Hi. Finally I solved the problem. First, the qdisk over LVM does not
work very well; switching to a plain device works better. And, as you
stated, without a fence device it is not possible to get the cluster to
work properly, so I'm going to push the VMware admins and use VMware fencing.

I'm very grateful. I've been working on this problem for 3 days without
understanding what was happening, and with only a few emails the problem
is solved. The only thing that bothers me is why cman doesn't warn you
that without proper fencing the cluster won't work. Moreover, I haven't
found in the Red Hat documentation a statement saying what I read
in the link you pasted:

    Fencing is an absolutely critical part of clustering. Without fully
    working fence devices, your cluster will fail.



I'm a bit sorry to say it, but now I have another problem. With the cluster
quorate and the two nodes online plus the qdisk, when I start rgmanager on one
node everything works OK and the service starts. Then I start rgmanager
on the other node, but on the second node clustat doesn't show the service:

node2 (with the service working):

[root@node2 ~]# clustat
Cluster Status for test_cluster @ Wed Jun 20 16:21:19 2012
Member Status: Quorate

  Member Name                             ID   Status
  ------ ----                             ---- ------
node1-hb                                  1 Online
node2-hb                               2 Online, Local, rgmanager
  /dev/disk/by-path/pci-0000:02:01.0-scsi-    0 Online, Quorum Disk

  Service Name                   Owner (Last)                   State
  ------- ----                   ----- ------                   -----
  service:postgres                  node2-hb                  started

node1 (doesn't see the service):

[root@node1 ~]# clustat
Cluster Status for test_cluster @ Wed Jun 20 16:21:15 2012
Member Status: Quorate

  Member Name                             ID   Status
  ------ ----                             ---- ------
  node1-hb                                  1 Online, Local
  node2-hb                               2 Online
  /dev/disk/by-path/pci-0000:02:01.0-scsi-    0 Online, Quorum Disk

In /var/log/messages I don't see any errors, only this:
last message repeated X times

What am I missing? As far as I can see, rgmanager doesn't appear on node1, but:

[root@node1 ~]# service rgmanager status
clurgmgrd (pid  8254) is running...

The cluster.conf:

<?xml version="1.0"?>
<cluster alias="test_cluster" config_version="15" name="test_cluster">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="6"/>
        <clusternodes>
                <clusternode name="node1-hb" nodeid="1" votes="1">
                        <fence>
                                <method name="manual">
                                        <device name="fence_manual" nodename="node1-hb"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node2-hb" nodeid="2" votes="1">
                        <fence>
                                <method name="manual">
                                        <device name="fence_manual" nodename="node2-hb"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman two_node="0" expected_votes="3"/>
        <fencedevices>
                <fencedevice agent="fence_manual" name="fence_manual"/>
        </fencedevices>
        <rm log_facility="local4" log_level="7">
                <failoverdomains>
                        <failoverdomain name="etest_cluster_fo" nofailback="1" ordered="1" restricted="1">
                                <failoverdomainnode name="node1-hb" priority="1"/>
                                <failoverdomainnode name="node2-hb" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources/>
                <service autostart="1" domain="test_cluster_fo" exclusive="0" name="postgres" recovery="relocate">
                        <ip address="172.24.119.44" monitor_link="1"/>
                        <lvm name="vg_postgres" vg_name="vg_postgres" lv_name="postgres"/>
                        <fs device="/dev/vg_postgres/postgres" force_fsck="1" force_unmount="1" fstype="ext3" mountpoint="/var/lib/pgsql" name="postgres" self_fence="0"/>
                        <script file="/etc/init.d/postgresql" name="postgres"/>
                </service>
        </rm>
        <totem consensus="4000" join="60" token="20000" token_retransmits_before_loss_const="20"/>
        <quorumd interval="1" label="cluster_qdisk" tko="10" votes="1">
                <heuristic program="/usr/share/cluster/check_eth_link.sh eth0" score="1" interval="2" tko="3"/>
        </quorumd>
</cluster>



Regards, Javi.


2012/6/20 Digimer <lists@xxxxxxxxxx>

    It's worth re-stating;

    You are running an unsupported configuration. Please try to have the
    VMware admins enable fence calls against your nodes and set up
    fencing. Until and unless you do, you will almost certainly run into
    problems, up to and including corrupting your data.

    Please take a minute to read this:

    https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Concept.3B_Fencing

    Digimer


    On 06/20/2012 11:22 AM, emmanuel segura wrote:

        OK Javier,

        So now I know you don't want fencing, and the reason why :-)

        <fence_daemon clean_start="1" post_fail_delay="0"
        post_join_delay="-1"/>

        and use the fence_manual agent.



        2012/6/20 Javier Vela <jvdiago@xxxxxxxxx>


            I don't use fencing because with HA-LVM I thought that I didn't need
            it, but also because both nodes are VMs in VMware. I know that there
            is a module to do fencing with VMware, but I prefer to avoid it. I'm
            not in control of the VMware infrastructure, and the VMware admins
            probably won't give me the tools to use this module.

            Regards, Javi

                Fencing is critical, and running a cluster without fencing, even
                with qdisk, is not supported. Manual fencing is also not supported.
                The *only* way to have a reliable cluster, testing or production,
                is to use fencing.

                Why do you not wish to use it?

                On 06/20/2012 09:43 AM, Javier Vela wrote:


                > As I read, if you use HA-LVM you don't need fencing because of VG
                > tagging. Is it absolutely mandatory to use fencing with qdisk?
                >
                > If it is, I suppose I can use fence_manual, but in production I also
                > won't use fencing.
                >
                > Regards, Javi.
                >
                > Date: Wed, 20 Jun 2012 14:45:28 +0200
                > From: emi2fast@xxxxxxxxx
                > To: linux-cluster@xxxxxxxxxx
                > Subject: Re: Node can't join already quorated cluster


                >
                > If you don't want to use a real fence device, because you are only
                > doing some tests, you have to use the fence_manual agent.
                >
                > 2012/6/20 Javier Vela <jvdiago@xxxxxxxxx>



                >
                >     Hi, I have a very strange problem, and after searching through
                >     lots of forums I haven't found the solution. This is the scenario:
                >
                >     Two-node cluster with Red Hat 5.7, HA-LVM, no fencing and a quorum
                >     disk. I start qdiskd, cman and rgmanager on one node. After about 5
                >     minutes the fencing finally finishes and the cluster gets quorate
                >     with 2 votes:
                >
                >     [root@node2 ~]# clustat
                >     Cluster Status for test_cluster @ Wed Jun 20 05:56:39 2012
                >     Member Status: Quorate
                >
                >       Member Name                                  ID   Status
                >       ------ ----                                  ---- ------
                >       node1-hb                                        1 Offline
                >       node2-hb                                        2 Online, Local, rgmanager
                >       /dev/mapper/vg_qdisk-lv_qdisk                   0 Online, Quorum Disk
                >
                >       Service Name                   Owner (Last)                   State
                >       ------- ----                   ----- ------                   -----
                >       service:postgres               node2                          started
                >
                >     Now I start the second node. When cman reaches fencing, it hangs
                >     for about 5 minutes and finally fails. clustat says:
                >
                >     [root@node1 ~]# clustat
                >     Cluster Status for test_cluster @ Wed Jun 20 06:01:12 2012
                >     Member Status: Inquorate
                >
                >       Member Name                                  ID   Status
                >       ------ ----                                  ---- ------
                >       node1-hb                                        1 Online, Local
                >       node2-hb                                        2 Offline
                >       /dev/mapper/vg_qdisk-lv_qdisk                   0 Offline
                >
                >     And in /var/log/messages I can see these errors:
                >
                >     Jun 20 06:02:12 node1 openais[6098]: [TOTEM] entering OPERATIONAL state.
                >     Jun 20 06:02:12 node1 openais[6098]: [CLM  ] got nodejoin message 15.15.2.10
                >     Jun 20 06:02:13 node1 dlm_controld[5386]: connect to ccs error -111, check ccsd or cluster status
                >     Jun 20 06:02:13 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:13 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:13 node1 ccsd[6090]: Initial status:: Inquorate
                >     Jun 20 06:02:13 node1 gfs_controld[5392]: connect to ccs error -111, check ccsd or cluster status
                >     Jun 20 06:02:13 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:13 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:14 node1 openais[6098]: [TOTEM] entering GATHER state from 9.
                >     Jun 20 06:02:14 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:14 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:14 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:14 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:15 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:15 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:15 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:15 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:15 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:15 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:16 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:16 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:16 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:16 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:17 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:17 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:17 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:17 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] entering GATHER state from 0.
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] Creating commit token because I am the rep.
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] Storing new sequence id for ring 15c
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] entering COMMIT state.
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] entering RECOVERY state.
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] position [0] member 15.15.2.10:
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] previous ring seq 344 rep 15.15.2.10
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] aru e high delivered e received flag 1
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] Did not need to originate any messages in recovery.
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] Sending initial ORF token
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] entering OPERATIONAL state.
                >     Jun 20 06:02:18 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.
                >     Jun 20 06:02:18 node1 ccsd[6090]: Error while processing connect: Connection refused
                >     Jun 20 06:02:18 node1 openais[6098]: [TOTEM] entering GATHER state from 9.
                >     Jun 20 06:02:18 node1 ccsd[6090]: Cluster is not quorate.  Refusing connection.


                >
                >     And the quorum disk:
                >
                >     [root@node2 ~]# mkqdisk -L -d
                >     mkqdisk v0.6.0
                >     /dev/mapper/vg_qdisk-lv_qdisk:
                >     /dev/vg_qdisk/lv_qdisk:
                >              Magic:                eb7a62c2
                >              Label:                cluster_qdisk
                >              Created:              Thu Jun  7 09:23:34 2012
                >              Host:                 node1
                >              Kernel Sector Size:   512
                >              Recorded Sector Size: 512
                >
                >     Status block for node 1
                >              Last updated by node 2
                >              Last updated on Wed Jun 20 06:17:23 2012
                >              State: Evicted
                >              Flags: 0000
                >              Score: 0/0
                >              Average Cycle speed: 0.000500 seconds
                >              Last Cycle speed: 0.000000 seconds
                >              Incarnation: 4fe1a06c4fe1a06c
                >
                >     Status block for node 2
                >              Last updated by node 2
                >              Last updated on Wed Jun 20 07:09:38 2012
                >              State: Master
                >              Flags: 0000
                >              Score: 0/0
                >              Average Cycle speed: 0.001000 seconds
                >              Last Cycle speed: 0.000000 seconds
                >              Incarnation: 4fe1a06c4fe1a06c
                >
                >
                >     On the other node I don't see any errors in /var/log/messages. One
                >     strange thing is that if I start cman on both nodes at the same
                >     time, everything works fine and both nodes become quorate (until I
                >     reboot one node and the problem appears). I've checked that
                >     multicast is working properly: with iperf I can send and receive
                >     multicast packets, and with tcpdump I've seen the packets that
                >     openais sends when cman is trying to start. I've read about a bug
                >     in RH 5.3 with the same behaviour, but it was solved in RH 5.4.
                >
                >     I don't have SELinux enabled, and iptables is also disabled. Here
                >     is the cluster.conf, simplified (with fewer services and resources). I
                >     want to point out one thing: I have allow_kill="0" in order to avoid
                >     fencing errors when qdiskd tries to fence a failed node. As <fence/>
                >     is empty, before adding this stanza I got a lot of messages in
                >     /var/log/messages about failed fencing.
                >
                >     <?xml version="1.0"?>
                >     <cluster alias="test_cluster" config_version="15" name="test_cluster">
                >             <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="-1"/>
                >             <clusternodes>
                >                     <clusternode name="node1-hb" nodeid="1" votes="1">
                >                             <fence/>
                >                     </clusternode>
                >                     <clusternode name="node2-hb" nodeid="2" votes="1">
                >                             <fence/>
                >                     </clusternode>
                >             </clusternodes>
                >             <cman two_node="0" expected_votes="3"/>
                >             <fencedevices/>
                >             <rm log_facility="local4" log_level="7">
                >                     <failoverdomains>
                >                             <failoverdomain name="etest_cluster_fo" nofailback="1" ordered="1" restricted="1">
                >                                     <failoverdomainnode name="node1-hb" priority="1"/>
                >                                     <failoverdomainnode name="node2-hb" priority="2"/>
                >                             </failoverdomain>
                >                     </failoverdomains>
                >                     <resources/>
                >                     <service autostart="1" domain="test_cluster_fo" exclusive="0" name="postgres" recovery="relocate">
                >                             <ip address="172.24.119.44" monitor_link="1"/>
                >                             <lvm name="vg_postgres" vg_name="vg_postgres" lv_name="postgres"/>
                >                             <fs device="/dev/vg_postgres/postgres" force_fsck="1" force_unmount="1" fstype="ext3" mountpoint="/var/lib/pgsql" name="postgres" self_fence="0"/>
                >                             <script file="/etc/init.d/postgresql" name="postgres"/>
                >                     </service>
                >             </rm>
                >             <totem consensus="4000" join="60" token="20000" token_retransmits_before_loss_const="20"/>
                >             <quorumd allow_kill="0" interval="1" label="cluster_qdisk" tko="10" votes="1">
                >                     <heuristic program="/usr/share/cluster/check_eth_link.sh eth0" score="1" interval="2" tko="3"/>
                >             </quorumd>
                >     </cluster>
                >
                >
                >     The /etc/hosts:
                >     172.24.119.10 node1
                >     172.24.119.34 node2
                >     15.15.2.10 node1-hb node1-hb.localdomain


                >     15.15.2.11 node2-hb node2-hb.localdomain
                >
                >     And the versions:
                >     Red Hat Enterprise Linux Server release 5.7 (Tikanga)
                >     cman-2.0.115-85.el5
                >     rgmanager-2.0.52-21.el5


                >     openais-0.80.6-30.el5
                >
                >     I don't know what else I should try, so if you can give me some
                >     ideas, I will be very pleased.
                >
                >     Regards, Javi.
                >

                >
                >
                >
                >
                > --
                > this is my life and I live it for as long as God wills
                >


                --
                Digimer

                Papers and Projects: https://alteeve.com






        --
        this is my life and I live it for as long as God wills





    --
    Digimer
    Papers and Projects: https://alteeve.com









--
Digimer
Papers and Projects: https://alteeve.com


--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster


