Re: GFS problem

- The cluster.conf lists more than 12 nodes (15 clusternode entries). If some of them are redundant, you may need to clean up cluster.conf, just in case.
- Why expected_votes="8"? expected_votes should be the total number of votes in a fully functioning cluster; in your case that should be 12. Quorum is then calculated automatically by the basic formula (expected_votes / 2) + 1, so with 12 votes (1 vote per node) the quorum is 7. In other words, the cluster keeps running as long as at least 7 nodes are up.
- I'd change post_fail_delay="0" to 5 (seconds).
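
To make the suggestions above concrete, here is a rough sketch of how the affected lines might look. It assumes the node list has been trimmed to the 12 real members with 1 vote each, and that config_version is bumped to 15 because the file changed; both are assumptions on my part, so adjust them to your actual setup (and to however puppet manages the file).

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes 12 remaining nodes, 1 vote each -->
<cluster config_version="15" name="gfs-filmweb">
        <!-- wait 5 seconds after a node failure before fencing it -->
        <fence_daemon post_fail_delay="5" post_join_delay="3"/>
        <clusternodes>
                <!-- the 12 remaining clusternode entries go here, unchanged -->
        </clusternodes>
        <!-- expected_votes = sum of all node votes; quorum = 12/2 + 1 = 7 -->
        <cman expected_votes="12" two_node="0"/>
        <!-- fencedevices and rm sections stay as they are -->
</cluster>
```

If you keep all the nodes currently listed instead, expected_votes should match their total vote count rather than 12.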

If you still have no luck, try adding this line to your cluster.conf file:
<logging debug="on" logfile="/var/log/rhcs.log" to_file="yes"/>

Good luck,

   -- Abraham

On 27/01/2010, at 11:22 PM, Alex Urbanowicz wrote:

From: Jorge Palma <jpalmae@xxxxxxxxx>
To: linux clustering <linux-cluster@xxxxxxxxxx>
Subject: Re: GFS problem
Message-ID:
       <5b65f1b11001251343p659d3b96gf07dd2165adf521e@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset=ISO-8859-1

Please send your fence configuration and cluster.conf

cluster.conf:

<?xml version="1.0"?>
<!--
** puppet managed file $Revision: 2889 $
-->
<cluster config_version="14" name="gfs-filmweb">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="www1" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="www1" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <!-- if method 1 happens to fail, use method 2 -->
                                <method name="2">
                                        <device name="manual" nodename="www1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="www2" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="www2" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="www2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app1" nodeid="65" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="app1" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="app1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app2" nodeid="66" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="app2" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="app2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app3" nodeid="67" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="app3" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="app3"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app4" nodeid="68" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="app4" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="app4"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app5" nodeid="69" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="app5" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="app5"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app6" nodeid="70" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="app6" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="app6"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app7" nodeid="71" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="app7" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="app7"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="blade403" nodeid="72" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="blade403" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="blade403"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="blade404" nodeid="73" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="blade404" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="blade404"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="blade405" nodeid="74" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="blade405" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="blade405"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="blade406" nodeid="75" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="blade406" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="blade406"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="blade407" nodeid="76" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="blade407" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="blade407"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="blade408" nodeid="77" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="rsysrq" nodename="blade408" password="fencepassword" port="9" operation="1bbbb"/>
                                </method>
                                <method name="2">
                                        <device name="manual" nodename="blade408"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="8" two_node="0"/>
        <fencedevices>
                <fencedevice agent="fence_rsysrq" name="rsysrq"/>
                <fencedevice agent="fence_manual" name="manual"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
</cluster>

Fencing is done using fence_rsysrq, so there is no configuration to speak of except the iptables/modprobe part:

options ipt_SYSRQ passwd="fencepassword" tolerance=3720

-A INPUT  -i bond0.108 -s 10.100.108.0/24 -d <hostip> -p udp -m udp --dport 9 -j SYSRQ

Alex.
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

''''''''''''''''''''''''''''''''''''''''''''''''''''''
Abraham Alawi

Unix/Linux Systems Administrator
Science IT
University of Auckland
e: a.alawi@xxxxxxxxxxxxxx
p: +64-9-373 7599, ext#: 87572

''''''''''''''''''''''''''''''''''''''''''''''''''''''
