Re: Quorum device brain the cluster when master lose network

Hi,

Can you make the following changes in your cluster.conf file:


        <cman expected_votes="5" quorum_dev_poll="25000">
                <multicast addr="ZZ.ZZ.ZZ.ZZ"/>
        </cman>
        <totem token="30000" consensus="15000"/>
        <quorumd interval="3" label="quorum" tko="7" votes="1" min_score="1">
                <heuristic program="/bin/ping -c 1 -W 4 10.148.8.1" score="1" interval="2" tko="2"/>
        </quorumd>
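For reference, the timing relationship these values appear to be sized for (assuming the usual qdisk rule of thumb that quorum_dev_poll outlasts the qdisk eviction window, and the totem token outlasts the poll):

        qdisk eviction window : interval * tko = 3 * 7 = 21 s
        quorum_dev_poll       : 25000 ms = 25 s (> 21 s)
        totem token           : 30000 ms = 30 s (> 25 s)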

Then restart the cluster and check.
Also capture cman_tool status and paste /var/log/messages.
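A minimal sketch of one way to do that on RHEL 5 (service names assumed from the stock cman/qdiskd/rgmanager init scripts; adjust to your environment):

        # stop the stack, resource manager first
        service rgmanager stop
        service qdiskd stop
        service cman stop
        # edit /etc/cluster/cluster.conf on every node and bump config_version
        service cman start
        service qdiskd start
        service rgmanager start
        # capture the requested output
        cman_tool status
        tail -n 200 /var/log/messages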


On Mon, Aug 13, 2012 at 9:24 PM, GouNiNi <gounini.geekarea@xxxxxxxxx> wrote:
Sorry, the server was not usable until now.

[root@mynode1 ~]# cman_tool status
Version: 6.2.0
Config Version: 162
Cluster Name: cluname
Cluster Id: 57462
Cluster Member: Yes
Cluster Generation: 836
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Quorum device votes: 1
Total votes: 5
Quorum: 3
Active subsystems: 9
Flags: Dirty
Ports Bound: 0 177
Node name: mynode1
Node ID: 1
Multicast addresses: XX.XX.XX.XX
Node addresses: YY.YY.YY.YY

I reproduced my problem today.
I tried to use a special heuristic to make the node drop the quorum device, but it's not working:
<heuristic program="/bin/ping -c1 -w1 10.148.8.1 || /etc/init.d/qdiskd stop" score="1" interval="2" tko="2"/>
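One likely problem with this approach: qdiskd runs the heuristic itself, so a heuristic that stops qdiskd is killing the daemon from inside its own check, and heuristics are meant to be quick, side-effect-free tests in any case. A minimal sketch of the usual pattern instead, where the heuristic simply fails and qdiskd withdraws the node's vote on its own (the wrapper path and name below are hypothetical, not from this thread):

#!/bin/sh
# /usr/local/sbin/qdisk-gw-check.sh (hypothetical path)
# Exit 0 only when the gateway answers one ping within 1 second;
# any other exit code counts as a failed heuristic for qdiskd.
exec /bin/ping -c 1 -W 1 10.148.8.1 >/dev/null 2>&1

referenced from cluster.conf as:

<heuristic program="/usr/local/sbin/qdisk-gw-check.sh" score="1" interval="2" tko="2"/>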

Any ideas?

--
  .`'`.   GouNiNi
 :  ': :
 `. ` .`  GNU/Linux
   `'`    http://www.geekarea.fr


----- Original Message -----
> From: "emmanuel segura" <emi2fast@xxxxxxxxx>
> To: "linux clustering" <linux-cluster@xxxxxxxxxx>
> Sent: Tuesday, August 7, 2012 14:31:13
> Subject: Re: Quorum device brain the cluster when master lose network
>
>
> send me a cman_tool status ;-)
>
>
> 2012/8/7 GouNiNi < gounini.geekarea@xxxxxxxxx >
>
>
> Yes I do ;)
>
> --
> .`'`. GouNiNi
> : ': :
> `. ` .` GNU/Linux
> `'` http://www.geekarea.fr
>
>
> ----- Original Message -----
> > From: "emmanuel segura" < emi2fast@xxxxxxxxx >
> > To: "linux clustering" < linux-cluster@xxxxxxxxxx >
> > Sent: Tuesday, August 7, 2012 11:29:59
> > Subject: Re: Quorum device brain the cluster when master lose network
> >
> >
> > did you reboot all nodes in your cluster after removing the
> > expected_votes?
> >
> >
> > 2012/8/7 GouNiNi < gounini.geekarea@xxxxxxxxx >
> >
> >
> > Hello,
> >
> > My problem is still here.
> > I tried without expected_votes="5", but nothing changed in my test of
> > losing network on two nodes.
> >
> > Any other idea?
> >
> > Regards,
> >
> >
> > --
> > .`'`. GouNiNi
> > : ': :
> > `. ` .` GNU/Linux
> > `'` http://www.geekarea.fr
> >
> >
> > ----- Original Message -----
> > > From: "emmanuel segura" < emi2fast@xxxxxxxxx >
> > > To: "linux clustering" < linux-cluster@xxxxxxxxxx >
> > > Sent: Wednesday, August 1, 2012 10:58:59
> > > Subject: Re: Quorum device brain the cluster when master lose network
> > >
> > >
> > > Hello Gounini
> > >
> > > Sorry, but as I told you: remove <cman expected_votes="5"> and
> > > reboot the cluster.
> > >
> > > Let the cluster calculate the expected votes
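> > > i.e. something like this (a sketch; keep whatever multicast address
> > > you already have):
> > >
> > >         <cman>
> > >                 <multicast addr="ZZ.ZZ.ZZ.ZZ"/>
> > >         </cman>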
> > >
> > >
> > > 2012/8/1 GouNiNi < gounini.geekarea@xxxxxxxxx >
> > >
> > >
> > > I did this test one more time and got the same result, with more
> > > detail:
> > >
> > > When I shut down the network on 2 nodes, including the master, the
> > > master stays alive while the 2 online nodes fence the offline
> > > non-master node. The cluster then goes inquorate.
> > > When the fenced node comes back, it joins the cluster and the
> > > cluster becomes quorate. A new master is chosen and the old master
> > > is fenced.
> > >
> > > # cman_tool status
> > > Version: 6.2.0
> > > Config Version: 144
> > > Cluster Name: cluname
> > > Cluster Id: 57462
> > > Cluster Member: Yes
> > > Cluster Generation: 488
> > > Membership state: Cluster-Member
> > > Nodes: 4
> > > Expected votes: 5
> > > Quorum device votes: 1
> > > Total votes: 5
> > > Quorum: 3
> > > Active subsystems: 9
> > > Flags: Dirty
> > > Ports Bound: 0 177
> > > Node name: nodename
> > > Node ID: 2
> > > Multicast addresses: ZZ.ZZ.ZZ.ZZ
> > > Node addresses: YY.YY.YY.YY
> > >
> > > --
> > > .`'`. GouNiNi
> > > : ': :
> > > `. ` .` GNU/Linux
> > > `'` http://www.geekarea.fr
> > >
> > >
> > > ----- Original Message -----
> > > > From: "emmanuel segura" < emi2fast@xxxxxxxxx >
> > > > To: "linux clustering" < linux-cluster@xxxxxxxxxx >
> > > > Sent: Monday, July 30, 2012 17:35:39
> > > > Subject: Re: Quorum device brain the cluster when master lose network
> > > >
> > > >
> > > > can you send me the output from cman_tool status while the
> > > > cluster is running?
> > > >
> > > >
> > > > 2012/7/30 GouNiNi < gounini.geekarea@xxxxxxxxx >
> > > >
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Digimer" < lists@xxxxxxxxxx >
> > > > > To: "linux clustering" < linux-cluster@xxxxxxxxxx >
> > > > > Cc: "GouNiNi" < gounini.geekarea@xxxxxxxxx >
> > > > > Sent: Monday, July 30, 2012 17:10:10
> > > > > Subject: Re: Quorum device brain the cluster when master lose network
> > > > >
> > > > > On 07/30/2012 10:43 AM, GouNiNi wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I did some tests on a 4-node cluster with a quorum device and
> > > > > > I found a bad situation in one test, so I need your knowledge
> > > > > > to correct my configuration.
> > > > > >
> > > > > > Configuration:
> > > > > > 4 nodes, 1 vote each
> > > > > > quorum device with 1 vote (to keep services up with a minimum
> > > > > > of 2 nodes)
> > > > > > cman expected_votes 5
> > > > > >
> > > > > > Situation:
> > > > > > I shut down the network on 2 nodes, one of them the master.
> > > > > >
> > > > > > Observation:
> > > > > > One node (the master) is fenced... the quorum device goes
> > > > > > offline and quorum is dissolved! Services stopped.
> > > > > > The fenced node reboots, the cluster is quorate, and the 2nd
> > > > > > offline node is fenced. Services restart.
> > > > > > The 2nd offline node reboots.
> > > > > >
> > > > > > My cluster was not quorate for 8 min (very long hardware boot
> > > > > > :-) and my services were offline.
> > > > > >
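> > > > > > For reference, the vote math (assuming cman's usual quorum
> > > > > > formula of floor(expected_votes / 2) + 1):
> > > > > >
> > > > > >     quorum = floor(5 / 2) + 1 = 3
> > > > > >     2 nodes down: 2 node votes + 1 qdisk vote = 3, quorate only
> > > > > >     while the qdisk vote is counted
> > > > > >     qdisk offline: 2 node votes < 3, so inquorate; services stop
> > > > > >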
> > > > > > Do you know how to prevent this situation?
> > > > > >
> > > > > > Regards,
> > > > >
> > > > > Please tell us the name and version of the cluster software you
> > > > > are using. Please also share your configuration file(s).
> > > > >
> > > > > --
> > > > > Digimer
> > > > > Papers and Projects: https://alteeve.com
> > > > >
> > > >
> > > > Sorry, RHEL 5.6 64-bit.
> > > >
> > > > # rpm -q cman rgmanager
> > > > cman-2.0.115-68.el5
> > > > rgmanager-2.0.52-9.el5
> > > >
> > > >
> > > > <?xml version="1.0"?>
> > > > <cluster alias="cluname" config_version="144" name="cluname">
> > > >   <clusternodes>
> > > >     <clusternode name="node1" nodeid="1" votes="1">
> > > >       <fence>
> > > >         <method name="single">
> > > >           <device name="fenceIBM_307" port="12"/>
> > > >         </method>
> > > >       </fence>
> > > >     </clusternode>
> > > >     <clusternode name="node2" nodeid="2" votes="1">
> > > >       <fence>
> > > >         <method name="single">
> > > >           <device name="fenceIBM_307" port="11"/>
> > > >         </method>
> > > >       </fence>
> > > >     </clusternode>
> > > >     <clusternode name="node3" nodeid="3" votes="1">
> > > >       <fence>
> > > >         <method name="single">
> > > >           <device name="fenceIBM_308" port="6"/>
> > > >         </method>
> > > >       </fence>
> > > >     </clusternode>
> > > >     <clusternode name="node4" nodeid="4" votes="1">
> > > >       <fence>
> > > >         <method name="single">
> > > >           <device name="fenceIBM_308" port="7"/>
> > > >         </method>
> > > >       </fence>
> > > >     </clusternode>
> > > >   </clusternodes>
> > > >   <fencedevices>
> > > >     <fencedevice agent="fence_bladecenter" ipaddr="XX.XX.XX.XX"
> > > >       login="xxxx" name="fenceIBM_307" passwd="yyyy"/>
> > > >     <fencedevice agent="fence_bladecenter" ipaddr="YY.YY.YY.YY"
> > > >       login="xxxx" name="fenceIBM_308" passwd="yyyy"/>
> > > >   </fencedevices>
> > > >   <rm log_level="7">
> > > >     <failoverdomains/>
> > > >     <resources/>
> > > >     <service ...>
> > > >       <...>
> > > >     </service>
> > > >   </rm>
> > > >   <fence_daemon clean_start="0" post_fail_delay="15"
> > > >     post_join_delay="300"/>
> > > >   <cman expected_votes="5">
> > > >     <multicast addr="ZZ.ZZ.ZZ.ZZ"/>
> > > >   </cman>
> > > >   <quorumd interval="7" label="quorum" tko="12" votes="1"/>
> > > > </cluster>
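> > > >
> > > > A back-of-the-envelope on the quorumd timing above (assuming the
> > > > standard qdiskd semantics of declaring a node dead after tko missed
> > > > updates, interval seconds apart):
> > > >
> > > >     eviction window = interval * tko = 7 * 12 = 84 s
> > > >
> > > > Note also that this quorumd has no heuristics, so a node that loses
> > > > the network can keep writing to the quorum disk over shared storage
> > > > and retain its qdisk standing.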
> > > >
> > > >
> > > >
> > > > --
> > > > this is my life and I live it for as long as God wills
> > > >
> > >
> > >
> > >
> > > --
> > > this is my life and I live it for as long as God wills
> > >
> >
> >
> >
> > --
> > this is my life and I live it for as long as God wills
> >
>
>
>
> --
> this is my life and I live it for as long as God wills
>


--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
