Re: Error in Cluster.conf

On 06/24/2014 08:44 PM, Amjad Syed wrote:
I have updated the config file and validated it with ccs_config_validate.

I added the fence_daemon element with post_join_delay. I am using bonding over an Ethernet coaxial cable.
OK, check which bonding modes are supported (this depends on the OS used: it differs between RHEL (CentOS) 5 and RHEL (CentOS) 6).
But for some reason, whenever I start CMAN on a node, it fences (kicks) the other node. As a result, only one node is online at a time. Do I need to use multicast to get both nodes online at the same time?
It would be good to see the logs from the surviving node from before it decided to fence its peer. That said, boot the machines at the same time and watch the logs; the surviving node's logs will show the reason why it thinks its peer is not in a good state and needs to be fenced.

Multicast is used by default, and that traffic needs to be allowed on the cluster network. To rule out an issue with multicast, you can, for test purposes, change in cluster.conf
        <cman expected_votes="1" two_node="1"/>
to
        <cman expected_votes="1" broadcast="yes" two_node="1"/>

If the issue does not appear with broadcast="yes", then multicast could be the issue (and you can then work on fixing that). If you have RHEL 6 / CentOS, you can also try unicast UDP (udpu) by adding transport="udpu" to the cman stanza above; more in the documentation: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s1-unicast-traffic-CA.html
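
For illustration, the udpu variant of the cman stanza above might look like the sketch below. It assumes the rest of cluster.conf stays unchanged and that config_version in the cluster tag is incremented, as for any other edit to the file. broadcast="yes" is left out here so that only one transport change is tested at a time.

```xml
<!-- Sketch only: the two-node cman stanza with unicast UDP enabled for testing. -->
<cman expected_votes="1" two_node="1" transport="udpu"/>
```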

You must also ensure that fencing is working properly. I recommend taking the time to read: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Fence_Configuration_Guide/index.html

If there is still an issue after all your tests with the cluster, and you have a valid Red Hat subscription (with a proper support level: Standard or Premium), you can visit the Red Hat Customer Portal at https://access.redhat.com and open a case with Red Hat Support, where we can work on fixing the issue.

Kind regards,

Elvir Kuric

Or am I missing something here?

Now the file looks like this:


<?xml version="1.0"?>
<cluster config_version="2" name="oracleha">
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
           <fencedevice agent= "fence_ipmilan" ipaddr="10.10.63.93" login="ADMIN" name="inspuripmi"  passwd="xxxx"/>
           <fencedevice agent = "fence_ilo2" ipaddr="10.10.63.92" login="test" name="hpipmi"  passwd="xxxx"/>
          </fencedevices>
          <fence_daemon post_fail_delay="0" post_join_delay="60"/>
        <clusternodes>
           <clusternode name= "krplporcl001"  nodeid="1" votes= "1">
           <fence>
               <method name  = "1">
                 <device lanplus = "" name="inspuripmi"  action =""/>
                 </method>
            </fence>
           </clusternode>
            <clusternode name = "krplporcl002" nodeid="2" votes ="1">
                 <fence>
                 <method name = "1">
                  <device lanplus = "" name="hpipmi" action =""/>
                   </method>
              </fence>
            </clusternode>
         </clusternodes>


        <rm>

          <failoverdomains/>
        <resources/>
        <service autostart="1" exclusive="0" name="IP" recovery="relocate">
                <ip address="10.10.5.23" monitor_link="on" sleeptime="10"/>
        </service>
</rm>
</cluster>

Thanks


On Tue, Jun 24, 2014 at 6:46 PM, Digimer <lists@xxxxxxxxxx> wrote:
On 24/06/14 08:55 AM, Jan Pokorný wrote:
On 24/06/14 13:56 +0200, Fabio M. Di Nitto wrote:
On 6/24/2014 12:32 PM, Amjad Syed wrote:
Hello

I am getting the following error when I run ccs_config_validate

ccs_config_validate
Relax-NG validity error : Extra element clusternodes in interleave

You defined <clusternodes> twice.

That, plus there are more issues discoverable by the more powerful validator
jing (packaged in Fedora and RHEL 7, for instance, admittedly not
for RHEL 6/EPEL):

$ jing cluster.rng cluster.conf
cluster.conf:13:47: error:
   element "fencedvice" not allowed anywhere; expected the element
   end-tag or element "fencedevice"
cluster.conf:15:23: error:
   element "clusternodes" not allowed here; expected the element
   end-tag or element "clvmd", "dlm", "fence_daemon", "fence_xvmd",
   "gfs_controld", "group", "logging", "quorumd", "rm", "totem" or
   "uidgid"
cluster.conf:26:76: error:
   IDREF "fence_node2" without matching ID
cluster.conf:19:77: error:
   IDREF "fence_node1" without matching ID

So it also spotted:
- a typo in "fencedvice"
- broken referential integrity; the "name" attribute of a "device"
   tag is required to match the "name" of a defined "fencedevice"
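
To illustrate the referential-integrity point, here is a minimal sketch of a fence device and a node that refers to it. The agent, address, login, and node name are placeholders chosen for the example, not values from the original cluster.conf; only the name "fence_node1" comes from the jing output above.

```xml
<!-- Sketch only: all values except the name "fence_node1" are placeholders. -->
<fencedevices>
    <!-- The "name" here is the ID that a node's device tag must reference. -->
    <fencedevice agent="fence_ipmilan" name="fence_node1" ipaddr="192.0.2.10" login="admin" passwd="xxxx"/>
</fencedevices>
<clusternodes>
    <clusternode name="node1" nodeid="1">
        <fence>
            <method name="1">
                <!-- Must be the same string as the fencedevice "name" above. -->
                <device name="fence_node1"/>
            </method>
        </fence>
    </clusternode>
</clusternodes>
```

With the names matching, the "IDREF without matching ID" errors for that device should no longer be reported.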

Hope this helps.

-- Jan

Also, without fence methods defined for the nodes, rgmanager will block the first time there is an issue.

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?





-- 
Elvir Kuric,TSE / Red Hat / GSS EMEA / 
-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
