Re: how to handle fencing for a simple Apache active/passive cluster with a virtual IP on 2 virtual machines

No. When a node is lost, fenced is called. Fenced informs DLM that a fence is pending, and DLM stops issuing locks. Only after fenced confirms a successful fence is DLM told; DLM then reaps the locks held by the now-fenced node and recovery can begin.

Anything using DLM (rgmanager, clvmd, gfs2) will block. This is by design. If you ever allowed a cluster to make an assumption about the state of a lost node, you would risk a split-brain. If a split-brain were tolerable, you wouldn't need an HA cluster. :)
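
While a fence is pending you can watch this happening from the surviving node. Roughly like so (a sketch; the commands are the standard RHEL 6 cluster tools, but exact output varies by version):

  # Show the fence domain state; a failed fence leaves the dead node
  # listed as a victim and fenced stuck waiting on the fence.
  fence_tool ls

  # rgmanager keeps the service stopped/recovering until the fence completes.
  clustat

  # Only after you have verified by hand that the node really is dead:
  # this tells fenced the fence succeeded so recovery can continue.
  fence_ack_manual mynode02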

digimer

On 01/02/14 04:11 PM, nik600 wrote:
OK, but is it not possible to ignore fencing?

On 01/Feb/2014 22:09, "Digimer" <lists@xxxxxxxxxx> wrote:

    Ooooh, I'm not sure what option you have then. I suppose
    fence_virtd/fence_xvm is your best option, but you're going to need
    to have the admin configure the fence_virtd side.
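
    For reference, the host-side piece the admin would set up is
    /etc/fence_virt.conf for the fence_virtd daemon. A rough sketch from
    memory only (check fence_virtd(8)/fence_virt.conf(5) for the real
    defaults; the interface name and key path here are assumptions):

        fence_virtd {
            listener = "multicast";   # guests reach the host over multicast
            backend = "libvirt";      # fence_virtd asks libvirt to kill the VM
        }
        listeners {
            multicast {
                interface = "br0";                        # bridge facing the guests
                key_file = "/etc/cluster/fence_xvm.key";  # shared key
            }
        }
        backends {
            libvirt {
                uri = "qemu:///system";
            }
        }

    The same key file also has to be copied to /etc/cluster/ on both
    guests, or the fence_xvm requests will not authenticate.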

    On 01/02/14 03:50 PM, nik600 wrote:

        My problem is that I don't have root access at the host level.

        On 01/Feb/2014 19:49, "Digimer" <lists@xxxxxxxxxx> wrote:

             On 01/02/14 01:35 PM, nik600 wrote:

                  Dear all,

                  I need some clarification about clustering with RHEL 6.4.

                  I have a cluster with 2 nodes in an active/passive
                  configuration; I simply want to have a virtual IP and
                  migrate it between the 2 nodes.

                  I've noticed that if I reboot or manually shut down a node
                  the failover works correctly, but if I power off one node
                  the cluster doesn't fail over to the other node.

                  Another strange situation is that if I power off all the
                  nodes and then switch on only one, the cluster doesn't
                  start on the active node.

                  I've read the manual and documentation at

                  https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/index.html

                  and I've understood that the problem is related to
                  fencing. The problem is that my 2 nodes are 2 virtual
                  machines: I can't control the hardware and can't issue
                  any custom command on the host side.

                  I've tried to use fence_xvm, but I'm not sure about it:
                  if my VM is powered off, how can it reply to fence_xvm
                  messages?

                  Here are my logs when I power off the VM:

                  ==> /var/log/cluster/fenced.log <==
                  Feb 01 18:50:22 fenced fencing node mynode02
                  Feb 01 18:50:53 fenced fence mynode02 dev 0.0 agent fence_xvm result: error from agent
                  Feb 01 18:50:53 fenced fence mynode02 failed

                 I've tried to force the manual fence with:

                 fence_ack_manual mynode02

                 and in this case the failover works properly.

                  The point is: as I'm not using any shared filesystem but
                  am only serving Apache with a virtual IP, I won't have
                  any split-brain scenario, so I don't need fencing. Or do I?

                  So, is it possible to have a simple "dummy" fencing
                  method?

                  Here is my config.xml:

                  <?xml version="1.0"?>
                  <cluster config_version="20" name="hacluster">
                      <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="0"/>
                      <cman expected_votes="1" two_node="1"/>
                      <clusternodes>
                          <clusternode name="mynode01" nodeid="1" votes="1">
                              <fence>
                                  <method name="mynode01">
                                      <device domain="mynode01" name="mynode01"/>
                                  </method>
                              </fence>
                          </clusternode>
                          <clusternode name="mynode02" nodeid="2" votes="1">
                              <fence>
                                  <method name="mynode02">
                                      <device domain="mynode02" name="mynode02"/>
                                  </method>
                              </fence>
                          </clusternode>
                      </clusternodes>
                      <fencedevices>
                          <fencedevice agent="fence_xvm" name="mynode01"/>
                          <fencedevice agent="fence_xvm" name="mynode02"/>
                      </fencedevices>
                      <rm log_level="7">
                          <failoverdomains>
                              <failoverdomain name="MYSERVICE" nofailback="0" ordered="0" restricted="0">
                                  <failoverdomainnode name="mynode01" priority="1"/>
                                  <failoverdomainnode name="mynode02" priority="2"/>
                              </failoverdomain>
                          </failoverdomains>
                          <resources/>
                          <service autostart="1" exclusive="0" name="MYSERVICE" recovery="relocate">
                              <ip address="192.168.1.239" monitor_link="on" sleeptime="2"/>
                              <apache config_file="conf/httpd.conf" name="apache" server_root="/etc/httpd" shutdown_wait="0"/>
                          </service>
                      </rm>
                  </cluster>

                 Thanks to all in advance.


             The fence_virtd/fence_xvm agent works by using multicast to
             talk to the VM host. So the "off" confirmation comes from the
             hypervisor, not the target.
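
             A quick way to check whether fence_virtd on the host is even
             reachable from the guests is something like this (a sketch;
             the key file and multicast settings must match whatever the
             admin put in fence_virt.conf):

                 # Ask fence_virtd to list the domains it knows about; if this
                 # hangs or times out, guest-to-host multicast is broken.
                 fence_xvm -o list

                 # Query the status of one domain as fence_virtd sees it.
                 fence_xvm -H mynode02 -o status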

             Depending on your setup, you might find better luck with
             fence_virsh (I have to use this, as there is a known multicast
             issue with Fedora hosts). Can you try, as a test if nothing
             else, whether 'fence_virsh' will work for you?

             fence_virsh -a <host ip> -l root -p <host root pw> -n <virsh name for target vm> -o status

             If this works, it should be trivial to add to cluster.conf, and
             then you have a working fence method. However, I would recommend
             switching back to fence_xvm if you can; the fence_virsh agent
             depends on libvirtd running, which some consider a risk.
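
             For reference, the cluster.conf side would look roughly like
             the following (a sketch only; "virsh-host", the host IP,
             password and virsh domain names are placeholders, and each
             clusternode must point at whichever host actually runs it):

                 <clusternode name="mynode01" nodeid="1" votes="1">
                     <fence>
                         <method name="virsh">
                             <!-- "port" is the virsh domain name of this node -->
                             <device name="virsh-host" port="mynode01"/>
                         </method>
                     </fence>
                 </clusternode>
                 ...
                 <fencedevices>
                     <fencedevice agent="fence_virsh" name="virsh-host"
                                  ipaddr="<host ip>" login="root" passwd="<host root pw>"/>
                 </fencedevices>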

             hth

             --
             Digimer
             Papers and Projects: https://alteeve.ca/w/
             What if the cure for cancer is trapped in the mind of a person
             without access to education?






    --
    Digimer
    Papers and Projects: https://alteeve.ca/w/
    What if the cure for cancer is trapped in the mind of a person
    without access to education?






--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster



