Thanks Juanra,

I did it and tried to restart the cman service, still without success. Here is my new cluster.conf:

<?xml version="1.0"?>
<cluster alias="arevclust" config_version="1" name="arevclust">
  <clusternodes>
    <cman expected_votes="1" two_node="1">
    </cman>
    <clusternode name="gs21spli003.occ.lan" nodeid="1" votes="1">
    </clusternode>
    <clusternode name="gs21spli004.occ.lan" nodeid="2" votes="1">
    </clusternode>
  </clusternodes>
</cluster>

These are the messages I got:

cman not started: Multicast and node address families differ.
/usr/sbin/cman_tool: aisexec daemon didn't start

When I mounted the GFS file system I got this:

# mount -t gfs2 /dev/mapper/VolGroup01-LogVol01 /appli/prod
/sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
/sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
/sbin/mount.gfs2: can't connect to gfs_controld: Connection refused

I'm not sure, but I have a doubt about the lock table name I used when creating the GFS file system:

mkfs.gfs2 -p lock_dlm -t arevclust:appli/prod -j 1 /dev/mapper/VolGroup00-LogVol01

The name of the cluster is arevclust. The GFS file system is on /dev/mapper/VolGroup00-LogVol01 and the mount point is /appli/prod, so to identify the file system I put appli/prod after the cluster name. Is that correct?

Regards,
ntoughe@xxxxxxxxxxx

> From: linux-cluster-request@xxxxxxxxxx > Subject: Linux-cluster Digest, Vol 64, Issue 18 > To: linux-cluster@xxxxxxxxxx > Date: Thu, 13 Aug 2009 11:24:21 -0400 > > Send Linux-cluster mailing list submissions to > linux-cluster@xxxxxxxxxx > > To subscribe or unsubscribe via the World Wide Web, visit > https://www.redhat.com/mailman/listinfo/linux-cluster > or, via email, send a message with subject or body 'help' to > linux-cluster-request@xxxxxxxxxx > > You can reach the person managing the list at > linux-cluster-owner@xxxxxxxxxx > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Linux-cluster digest..." > > > Today's Topics: > > 1. 
Re: Qdisk question (brem belguebli) > 2. Re: Cman hang (Juan Ramon Martin Blanco) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 13 Aug 2009 17:23:16 +0200 > From: brem belguebli <brem.belguebli@xxxxxxxxx> > Subject: Re: Qdisk question > To: linux clustering <linux-cluster@xxxxxxxxxx> > Message-ID: > <29ae894c0908130823i65667021vdc840ae1f0ded134@xxxxxxxxxxxxxx> > Content-Type: text/plain; charset="iso-8859-1" > > Hi Lon and Thanks for this reply. > > In fact, thinking about it, my test wasn't very much representative of what > I was expecting to do. > > I blocked the qdisk communications to only one node which, after reading > your reply, kind of confirmed me that I did the wong test. I'm going to re > run it by blocking all the nodes to the qdisk. > > I'll also try your ping tie-breaker. > > Brem > > > 2009/8/13, Lon Hohberger <lhh@xxxxxxxxxx>: > > > > On Thu, 2009-08-13 at 00:45 +0200, brem belguebli wrote: > > > > > My understanding of qdisk is that it is used as a tie-breaker, but it > > > looks like it is more a heatbeat vector than a simple tie-breaker. > > > > Right, it's a secondary membership algorithm. > > > > > > > Until here, no real problem indeed, if the site gets apart from the > > > other prod site and also from the third site (hosting the iscsi target > > > qdisk) the 2 nodes from the failing site get evicted from the cluster. > > > > > > > > > But, what if my third site gets isolated while the 2 prod ones are > > > fine ? > > > > Qdisk votes will not be presented to CMAN any more, but the two sites > > should remain online if they still have a "majority" of votes. > > > > > > > The real question is what happens in case all the nodes loose access > > > to the qdisk while they're still able to see each others ? > > > > Qdisk is just a vote like other voting mechanisms. If all nodes lose > > access at the same time, it should behave like a node death. 
However, > > the default action if _one_ node loses access is to kill that node (even > > if CMAN still sees it). > > > > > > > The 4 nodes have each 1 vote and the qdisk 1 vote. The expected quorum > > > is 3. > > > > > > > If I loose the qdisk, the number of votes falls to 4, the cluster is > > > quorate (4>3) but it looks like everything goes bad, each node > > > deactivate itself as it can't write its alive status (--> heartbeat > > > vector) to the qdisk even if the network heartbeating is working > > > fine. > > > > What happens specifically? Most of the actions qdiskd performs are > > configurable. For example, if the nodes are rebooting, you can turn > > that behavior off. > > > > > > > > I wrote a simple 'ping' tiebreaker based the behaviors in RHEL3. It > > functions in many ways in the same manner as qdiskd with respect to vote > > advertisement to CMAN, but without needing a disk - maybe you would find > > it useful? > > > > http://people.redhat.com/lhh/qnet.tar.gz > > > > -- Lon > > > > -- > > Linux-cluster mailing list > > Linux-cluster@xxxxxxxxxx > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: https://www.redhat.com/archives/linux-cluster/attachments/20090813/5482c8bf/attachment.html > > ------------------------------ > > Message: 2 > Date: Thu, 13 Aug 2009 17:23:54 +0200 > From: Juan Ramon Martin Blanco <robejrm@xxxxxxxxx> > Subject: Re: Cman hang > To: linux clustering <linux-cluster@xxxxxxxxxx> > Message-ID: > <8a5668960908130823u11b46ad9pd1da5af3614ce3d3@xxxxxxxxxxxxxx> > Content-Type: text/plain; charset="iso-8859-1" > > On Thu, Aug 13, 2009 at 5:13 PM, NTOUGHE GUY-SERGE <ntoughe@xxxxxxxxxxx>wrote: > > > Hi, this is my cluster.conf > > <?xml version="1.0"?> > > <cluster alias="arevclust" config_version="21" name="arevclust"> > > <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/> > > <clusternodes> > > <clusternode name="host1" nodeid="1" votes="1"> > > <fence> > > <method name="2"> > > > > > <device name=""/> > > > You should configure a valid fencing method, and if you don't have any, use > fence_manual until you get it. > > > > > </method> > > </fence> > > <multicast addr="" interface=""/> > > > I am not sure, but I think you should erase this <multicast ....> tag > > Greetings, > Juanra > > > > > </clusternode> > > > > > <clusternode name="host2" nodeid="2" votes="1"> > > <fence> > > <method name="1"> > > <device name=""/> > > </method> > > <method name=""/> > > </fence> > > <multicast addr="" interface=""/> > > </clusternode> > > </clusternodes> > > <cman expected_votes="" two_node=""> > > <multicast addr=""/> > > </cman> > > <fencedevices> > > <fencedevice agent="fence_brocade" ipaddr="" login="" name="" passwd=""/> > > </fencedevices> > > <rm> > > <failoverdomains> > > </failoverdomains> > > <resources> > > </resources> > > </rm> > > </cluster> > > > > Regards > > > > > > > > > > ntoughe@xxxxxxxxxxx > > > > > > > > > > > From: linux-cluster-request@xxxxxxxxxx > > > Subject: Linux-cluster Digest, Vol 64, Issue 16 > > > To: linux-cluster@xxxxxxxxxx > > > Date: Thu, 13 Aug 2009 11:02:36 -0400 > > > > > > Send 
Linux-cluster mailing list submissions to > > > linux-cluster@xxxxxxxxxx > > > > > > To subscribe or unsubscribe via the World Wide Web, visit > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > or, via email, send a message with subject or body 'help' to > > > linux-cluster-request@xxxxxxxxxx > > > > > > You can reach the person managing the list at > > > linux-cluster-owner@xxxxxxxxxx > > > > > > When replying, please edit your Subject line so it is more specific > > > than "Re: Contents of Linux-cluster digest..." > > > > > > > > > Today's Topics: > > > > > > 1. Re: do I have a fence DRAC device? (ESGLinux) > > > 2. clusterservice stays in 'recovering' state (mark benschop) > > > 3. Re: Is there any backup heartbeat channel (Hakan VELIOGLU) > > > 4. Re: Is there any backup heartbeat channel > > > (Juan Ramon Martin Blanco) > > > 5. RHCS on KVM (Nehemias Jahcob) > > > 6. Cman hang (NTOUGHE GUY-SERGE) > > > 7. Re: gfs2 mount hangs (David Teigland) > > > 8. Re: Qdisk question (Lon Hohberger) > > > 9. Re: Cman hang (Juan Ramon Martin Blanco) > > > > > > > > > ---------------------------------------------------------------------- > > > > > > Message: 1 > > > Date: Thu, 13 Aug 2009 13:27:16 +0200 > > > From: ESGLinux <esggrupos@xxxxxxxxx> > > > Subject: Re: [Linux-cluster] do I have a fence DRAC device? 
> > > To: linux clustering <linux-cluster@xxxxxxxxxx> > > > Message-ID: > > > <3128ba140908130427i6ab85406ye6da34073e6a6e97@xxxxxxxxxxxxxx> > > > Content-Type: text/plain; charset="iso-8859-1" > > > > > > Hi, > > > I couldn´t reboot my system yet but I have installed the openmanage > > > packages: > > > > > > srvadmin-omacore-5.4.0-260 > > > srvadmin-iws-5.4.0-260 > > > srvadmin-syscheck-5.4.0-260 > > > srvadmin-rac5-components-5.4.0-260 > > > srvadmin-deng-5.4.0-260 > > > srvadmin-ipmi-5.4.0-260.DUP > > > srvadmin-racadm5-5.4.0-260 > > > srvadmin-omauth-5.4.0-260.rhel5 > > > srvadmin-hapi-5.4.0-260 > > > srvadmin-cm-5.4.0-260 > > > srvadmin-racdrsc5-5.4.0-260 > > > srvadmin-omilcore-5.4.0-260 > > > srvadmin-isvc-5.4.0-260 > > > srvadmin-storage-5.4.0-260 > > > srvadmin-jre-5.4.0-260 > > > srvadmin-omhip-5.4.0-260 > > > > > > Now I have the command racadm but when I try to execut it I get this: > > > > > > racadm config -g cfgSerial -o cfgSerialTelnetEnable 1 > > > ERROR: RACADM is unable to process the requested subcommand because there > > is > > > no > > > local RAC configuration to communicate with. > > > > > > Local RACADM subcommand execution requires the following: > > > > > > 1. A Remote Access Controller (RAC) must be present on the managed server > > > 2. Appropriate managed node software must be installed and running on the > > > server > > > > > > > > > What do I need to install/start? or until I configure the bios I can´t > > get > > > this work? 
> > > > > > Greetings > > > > > > ESG > > > > > > > > > 2009/8/11 <bergman@xxxxxxxxxxxx> > > > > > > > > > > > > > > > In the message dated: Tue, 11 Aug 2009 14:14:03 +0200, > > > > The pithy ruminations from Juan Ramon Martin Blanco on > > > > <Re: do I have a fence DRAC device?> were: > > > > => --===============1917368601== > > > > => Content-Type: multipart/alternative; > > > > boundary=0016364c7c07663f600470dca3b8 > > > > => > > > > => --0016364c7c07663f600470dca3b8 > > > > => Content-Type: text/plain; charset=ISO-8859-1 > > > > => Content-Transfer-Encoding: quoted-printable > > > > => > > > > => On Tue, Aug 11, 2009 at 2:03 PM, ESGLinux <esggrupos@xxxxxxxxx> > > wrote: > > > > => > > > > => > Thanks > > > > => > I=B4ll check it when I could reboot the server. > > > > => > > > > > => > greetings, > > > > => > > > > > => You have a BMC ipmi in the first network interface, it can be > > configured > > > > at > > > > => boot time (I don't remember if inside the BIOS or pressing > > > > cntrl+something > > > > => during boot) > > > > => > > > > > > > > Based on my notes, here's how I configured the DRAC interface on a Dell > > > > 1950 > > > > for use as a fence device: > > > > > > > > Configuring the card from Linux depending on the installation of > > > > Dell's > > > > OMSA package. 
Once that's installed, use the following > > > > commands: > > > > > > > > racadm config -g cfgSerial -o cfgSerialTelnetEnable 1 > > > > racadm config -g cfgLanNetworking -o cfgDNSRacName > > > > HOSTNAME_FOR_INTERFACE > > > > racadm config -g cfgDNSDomainName DOMAINNAME_FOR_INTERFACE > > > > racadm config -g cfgUserAdmin -o cfgUserAdminPassword -i 2 > > > > PASSWORD > > > > racadm config -g cfgNicEnable 1 > > > > racadm config -g cfgNicIpAddress WWW.XXX.YYY.ZZZ > > > > racadm config -g cfgNicNetmask WWW.XXX.YYY.ZZZ > > > > racadm config -g cfgNicGateway WWW.XXX.YYY.ZZZ > > > > racadm config -g cfgNicUseDhcp 0 > > > > > > > > > > > > I also save a backup of the configuration with: > > > > > > > > racadm getconfig -f ~/drac_config > > > > > > > > > > > > Hope this helps, > > > > > > > > Mark > > > > > > > > ---- > > > > Mark Bergman voice: 215-662-7310 > > > > mark.bergman@xxxxxxxxxxxxxx fax: 215-614-0266 > > > > System Administrator Section of Biomedical Image Analysis > > > > Department of Radiology University of Pennsylvania > > > > PGP Key: https://www.rad.upenn.edu/sbia/bergman > > > > > > > > > > > > => Greetings, > > > > => Juanra > > > > => > > > > => > > > > > => > ESG > > > > => > > > > > => > 2009/8/10 Paras pradhan <pradhanparas@xxxxxxxxx> > > > > => > > > > > => > On Mon, Aug 10, 2009 at 5:24 AM, ESGLinux<esggrupos@xxxxxxxxx> > > wrote: > > > > => >> > Hi all, > > > > => >> > I was designing a 2 node cluster and I was going to use 2 > > servers > > > > DELL > > > > => >> > PowerEdge 1950. 
I was going to buy a DRAC card to use for > > fencing > > > > but > > > > => >> > running several commands in the servers I have noticed that > > when I > > > > run > > > > => >> this > > > > => >> > command: > > > > => >> > #ipmitool lan print > > > > => >> > Set in Progress : Set Complete > > > > => >> > Auth Type Support : NONE MD2 MD5 PASSWORD > > > > => >> > Auth Type Enable : Callback : MD2 MD5 > > > > => >> > : User : MD2 MD5 > > > > => >> > : Operator : MD2 MD5 > > > > => >> > : Admin : MD2 MD5 > > > > => >> > : OEM : MD2 MD5 > > > > => >> > IP Address Source : Static Address > > > > => >> > IP Address : 0.0.0.0 > > > > => >> > Subnet Mask : 0.0.0.0 > > > > => >> > MAC Address : 00:1e:c9:ae:6f:7e > > > > => >> > SNMP Community String : public > > > > => >> > IP Header : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10 > > > > => >> > Default Gateway IP : 0.0.0.0 > > > > => >> > Default Gateway MAC : 00:00:00:00:00:00 > > > > => >> > Backup Gateway IP : 0.0.0.0 > > > > => >> > Backup Gateway MAC : 00:00:00:00:00:00 > > > > => >> > 802.1q VLAN ID : Disabled > > > > => >> > 802.1q VLAN Priority : 0 > > > > => >> > RMCP+ Cipher Suites : 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14 > > > > => >> > Cipher Suite Priv Max : aaaaaaaaaaaaaaa > > > > => >> > : X=Cipher Suite Unused > > > > => >> > : c=CALLBACK > > > > => >> > : u=USER > > > > => >> > : o=OPERATOR > > > > => >> > : a=ADMIN > > > > => >> > : O=OEM > > > > => >> > does this mean that I already have an ipmi card (not > > configured) > > > > that > > > > => I > > > > => >> can > > > > => >> > use for fencing? if the anwser is yes, where hell must I > > configure > > > > it? > > > > => I > > > > => >> > don=B4t see wher can I do it. > > > > => >> > If I haven=B4t a fencing device which one do you recommed to > > use? 
> > > > => >> > Thanks in advance > > > > => >> > ESG > > > > => >> > > > > > => >> > -- > > > > => >> > Linux-cluster mailing list > > > > => >> > Linux-cluster@xxxxxxxxxx > > > > => >> > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > => >> > > > > > => >> > > > > => >> Yes you have IPMI and if you are using 1950 Dell, DRAC should be > > > > there > > > > => >> too. You can see if you have DRAC or not when the server starts > > and > > > > => >> before the loading of the OS. > > > > => >> > > > > => >> I have 1850s and I am using DRAC for fencing. > > > > => >> > > > > => >> > > > > => >> Paras. > > > > => >> > > > > => >> -- > > > > => >> Linux-cluster mailing list > > > > => >> Linux-cluster@xxxxxxxxxx > > > > => >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > => >> > > > > => > > > > > => > > > > > > > > > > > > > > > > > -- > > > > Linux-cluster mailing list > > > > Linux-cluster@xxxxxxxxxx > > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > -------------- next part -------------- > > > An HTML attachment was scrubbed... > > > URL: > > https://www.redhat.com/archives/linux-cluster/attachments/20090813/a4558d27/attachment.html > > > > > > ------------------------------ > > > > > > Message: 2 > > > Date: Thu, 13 Aug 2009 14:45:13 +0200 > > > From: mark benschop <mark.benschop.lists@xxxxxxxxx> > > > Subject: clusterservice stays in 'recovering' state > > > To: linux-cluster@xxxxxxxxxx > > > Message-ID: > > > <f97c3a70908130545n11ce442ej17d74c9cdc450e45@xxxxxxxxxxxxxx> > > > Content-Type: text/plain; charset="iso-8859-1" > > > > > > Hi All, > > > > > > I've a problem with a clusterservice. The service was started up while > > one > > > of the resources, an NFS, export was not accessible. > > > Therefore the service never started up right but got into the > > 'recovering' > > > state. > > > In the mean time the NFS exports are setup properly but to no avail. 
> > > Stopping the clusterservice, using clusvcadm -d <service>, will result in > > > the service going down but staying in the 'recovering' state. > > > Starting it again doesn't work. The service doesn't start and stays in > > the > > > recovery status. > > > I'm suspecting rgmanager lost it somehow. > > > > > > Anybody had any ideas on what could be the problem and how to resolve it > > ? > > > > > > Thanks in advance, > > > Mark > > > -------------- next part -------------- > > > An HTML attachment was scrubbed... > > > URL: > > https://www.redhat.com/archives/linux-cluster/attachments/20090813/31731cd3/attachment.html > > > > > > ------------------------------ > > > > > > Message: 3 > > > Date: Thu, 13 Aug 2009 16:13:12 +0300 > > > From: Hakan VELIOGLU <veliogluh@xxxxxxxxxx> > > > Subject: Re: Is there any backup heartbeat channel > > > To: linux-cluster@xxxxxxxxxx > > > Message-ID: <20090813161312.11546h2sp6psr814@xxxxxxxxxxxxxxxxxx> > > > Content-Type: text/plain; charset=ISO-8859-9; DelSp="Yes"; > > > format="flowed" > > > > > > Thanks for all the answers. > > > > > > I think there is realy no backup heartbeat channel. Maybe the reason > > > is GFS. DLM works on the heartbeat channel. If you lost your heartbeat > > > you lose your lock consistency so it is better to fence the other > > > node. For this reason I think if you don't have enough network > > > interface on server and switch, loosing the heartbeat network may shut > > > all the cluster members. 
> > > > > > Hakan VELÝOÐLU > > > > > > > > > ----- robejrm@xxxxxxxxx den ileti --------- > > > Tarih: Thu, 13 Aug 2009 10:42:11 +0200 > > > Kimden: Juan Ramon Martin Blanco <robejrm@xxxxxxxxx> > > > Yanýt Adresi:linux clustering <linux-cluster@xxxxxxxxxx> > > > Konu: Re: Is there any backup heartbeat channel > > > Kime: linux clustering <linux-cluster@xxxxxxxxxx> > > > > > > > > > > 2009/8/13 Hakan VELIOGLU <veliogluh@xxxxxxxxxx> > > > > > > > >> ----- raju.rajsand@xxxxxxxxx den ileti --------- > > > >> Tarih: Thu, 13 Aug 2009 08:57:15 +0530 > > > >> Kimden: Rajagopal Swaminathan <raju.rajsand@xxxxxxxxx> > > > >> Yanýt Adresi:linux clustering <linux-cluster@xxxxxxxxxx> > > > >> Konu: Re: [Linux-cluster] Is there any backup heartbeat channel > > > >> Kime: linux clustering <linux-cluster@xxxxxxxxxx> > > > >> > > > >> > > > >> Greetings, > > > >>> > > > >>> 2009/8/12 Hakan VELIOGLU <veliogluh@xxxxxxxxxx>: > > > >>> > > > >>>> Hi list, > > > >>>> > > > >>>> I am trying a two node cluster with RH 5.3 on Sun X4150 hardware. I > > use a > > > >>>> > > > >>> > > > >>> IIRC, Sun x4150 has four ethernet ports. Two can be used for outside > > > >>> networking and two can be bonded and used for heartbeat. > > > >>> > > > >> I think, I couldn't explain my networking. I use two ethernet ports > > for xen > > > >> vm which are trunk and bonded ports. Then there left two. Our network > > > >> topology (which is out of my control) available for one port for > > server > > > >> control (SSH). > > > > > > > > So you can't use a bonded port for both server management and cluster > > > > communications, can you? You can configure an active-passive bonding > > and > > > > then you can have many virtual interfaces on top of that, i.e: bond0:0, > > > > bond0:1 and assign them the ip addesses you need. > > > > > > > > > > > > I use the other one with a cross over cable for heartbeat. So there is > > no > > > >> way for bonding these two interfaces. 
Of course if I buy an extra > > switch I > > > >> may do this. > > > > > > > > You can connect them to the same switch (though you lost kind of > > > > redundancy), or you can use two crossover cables and move the > > management IP > > > > to the same ports you are using for the vm's. > > > > > > > > Greetings, > > > > Juanra > > > > > > > >> > > > >> I don't realy understand why there is no backup heartbeat channel. LVS > > and > > > >> MS cluster has this ability. > > > >> > > > >>> > > > >>> ALOM can be used for fencing and can be on a seperate subnet if > > required. > > > >>> > > > >> I used this for fencing_ipmilan. > > > >> > > > >>> > > > >>> Regards > > > >>> > > > >>> Rajagopal > > > >>> > > > >>> -- > > > >>> Linux-cluster mailing list > > > >>> Linux-cluster@xxxxxxxxxx > > > >>> https://www.redhat.com/mailman/listinfo/linux-cluster > > > >>> > > > >>> > > > >> > > > >> ----- raju.rajsand@xxxxxxxxx den iletiyi bitir ----- > > > >> > > > >> > > > >> > > > >> > > > >> -- > > > >> Linux-cluster mailing list > > > >> Linux-cluster@xxxxxxxxxx > > > >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > >> > > > > > > > > > > > > > ----- robejrm@xxxxxxxxx den iletiyi bitir ----- > > > > > > > > > > > > > > > > > > ------------------------------ > > > > > > Message: 4 > > > Date: Thu, 13 Aug 2009 15:29:43 +0200 > > > From: Juan Ramon Martin Blanco <robejrm@xxxxxxxxx> > > > Subject: Re: [Linux-cluster] Is there any backup heartbeat channel > > > To: linux clustering <linux-cluster@xxxxxxxxxx> > > > Message-ID: > > > <8a5668960908130629n6ec05a88n463a3b03da331dae@xxxxxxxxxxxxxx> > > > Content-Type: text/plain; charset="iso-8859-9" > > > > > > 2009/8/13 Hakan VELIOGLU <veliogluh@xxxxxxxxxx> > > > > > > > Thanks for all the answers. > > > > > > > > I think there is realy no backup heartbeat channel. Maybe the reason is > > > > GFS. DLM works on the heartbeat channel. 
If you lost your heartbeat you > > lose > > > > your lock consistency so it is better to fence the other node. For this > > > > reason I think if you don't have enough network interface on server and > > > > switch, loosing the heartbeat network may shut all the cluster members. > > > > > > > There is no backup heartbeat channel because you should do the backup at > > a > > > operating system level, i.e: bonding > > > That's why you should use a bonded interface for the heartbeat channel > > with > > > at least 2 ethernet slaves; going further (for better redundancy) each of > > > the slaves should be on a different network card and you should connect > > the > > > each slave to a different switch. > > > But what I am trying to explain, is that you can use that bonded logical > > > interface also for things different from hearbeat. ;) > > > > > > Greetings, > > > Juanra > > > > > > > > > > Hakan VELÝOÐLU > > > > > > > > > > > > ----- robejrm@xxxxxxxxx den ileti --------- > > > > Tarih: Thu, 13 Aug 2009 10:42:11 +0200 > > > > Kimden: Juan Ramon Martin Blanco <robejrm@xxxxxxxxx> > > > > > > > > Yanýt Adresi:linux clustering <linux-cluster@xxxxxxxxxx> > > > > Konu: Re: [Linux-cluster] Is there any backup heartbeat channel > > > > Kime: linux clustering <linux-cluster@xxxxxxxxxx> > > > > > > > > > > > > 2009/8/13 Hakan VELIOGLU <veliogluh@xxxxxxxxxx> > > > >> > > > >> ----- raju.rajsand@xxxxxxxxx den ileti --------- > > > >>> Tarih: Thu, 13 Aug 2009 08:57:15 +0530 > > > >>> Kimden: Rajagopal Swaminathan <raju.rajsand@xxxxxxxxx> > > > >>> Yanýt Adresi:linux clustering <linux-cluster@xxxxxxxxxx> > > > >>> Konu: Re: Is there any backup heartbeat channel > > > >>> Kime: linux clustering <linux-cluster@xxxxxxxxxx> > > > >>> > > > >>> > > > >>> Greetings, > > > >>> > > > >>>> > > > >>>> 2009/8/12 Hakan VELIOGLU <veliogluh@xxxxxxxxxx>: > > > >>>> > > > >>>> Hi list, > > > >>>>> > > > >>>>> I am trying a two node cluster with RH 5.3 on Sun X4150 hardware. 
I > > use > > > >>>>> a > > > >>>>> > > > >>>>> > > > >>>> IIRC, Sun x4150 has four ethernet ports. Two can be used for outside > > > >>>> networking and two can be bonded and used for heartbeat. > > > >>>> > > > >>>> I think, I couldn't explain my networking. I use two ethernet ports > > for > > > >>> xen > > > >>> vm which are trunk and bonded ports. Then there left two. Our network > > > >>> topology (which is out of my control) available for one port for > > server > > > >>> control (SSH). > > > >>> > > > >> > > > >> So you can't use a bonded port for both server management and cluster > > > >> communications, can you? You can configure an active-passive bonding > > and > > > >> then you can have many virtual interfaces on top of that, i.e: > > bond0:0, > > > >> bond0:1 and assign them the ip addesses you need. > > > >> > > > >> > > > >> I use the other one with a cross over cable for heartbeat. So there is > > no > > > >> > > > >>> way for bonding these two interfaces. Of course if I buy an extra > > switch > > > >>> I > > > >>> may do this. > > > >>> > > > >> > > > >> You can connect them to the same switch (though you lost kind of > > > >> redundancy), or you can use two crossover cables and move the > > management > > > >> IP > > > >> to the same ports you are using for the vm's. > > > >> > > > >> Greetings, > > > >> Juanra > > > >> > > > >> > > > >>> I don't realy understand why there is no backup heartbeat channel. > > LVS > > > >>> and > > > >>> MS cluster has this ability. > > > >>> > > > >>> > > > >>>> ALOM can be used for fencing and can be on a seperate subnet if > > > >>>> required. > > > >>>> > > > >>>> I used this for fencing_ipmilan. 
> > > >>> > > > >>> > > > >>>> Regards > > > >>>> > > > >>>> Rajagopal > > > >>>> > > > >>>> -- > > > >>>> Linux-cluster mailing list > > > >>>> Linux-cluster@xxxxxxxxxx > > > >>>> https://www.redhat.com/mailman/listinfo/linux-cluster > > > >>>> > > > >>>> > > > >>>> > > > >>> ----- raju.rajsand@xxxxxxxxx den iletiyi bitir ----- > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> -- > > > >>> Linux-cluster mailing list > > > >>> Linux-cluster@xxxxxxxxxx > > > >>> https://www.redhat.com/mailman/listinfo/linux-cluster > > > >>> > > > >>> > > > >> > > > > > > > > ----- robejrm@xxxxxxxxx den iletiyi bitir ----- > > > > > > > > > > > > > > > > > > > > -- > > > > Linux-cluster mailing list > > > > Linux-cluster@xxxxxxxxxx > > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > -------------- next part -------------- > > > An HTML attachment was scrubbed... > > > URL: > > https://www.redhat.com/archives/linux-cluster/attachments/20090813/cd4dc079/attachment.html > > > > > > ------------------------------ > > > > > > Message: 5 > > > Date: Thu, 13 Aug 2009 10:07:13 -0400 > > > From: Nehemias Jahcob <nehemiasjahcob@xxxxxxxxx> > > > Subject: RHCS on KVM > > > To: linux clustering <linux-cluster@xxxxxxxxxx> > > > Message-ID: > > > <5f61ab380908130707q5c936504k7351d0d6b3459090@xxxxxxxxxxxxxx> > > > Content-Type: text/plain; charset="iso-8859-1" > > > > > > Hi. > > > > > > How to create a cluster of 2 nodes in rhel5.4 (or Fedora 10) with KVM? > > > > > > With XEN follow this guide: > > > http://sources.redhat.com/cluster/wiki/VMClusterCookbook?highlight = > > > (CategoryHowTo). > > > > > > Do you have a guide to implementation of RHCS in KVM? > > > > > > Thank you all. > > > NJ > > > -------------- next part -------------- > > > An HTML attachment was scrubbed... 
> > > URL: > > https://www.redhat.com/archives/linux-cluster/attachments/20090813/f3a69a80/attachment.html > > > > > > ------------------------------ > > > > > > Message: 6 > > > Date: Thu, 13 Aug 2009 14:16:47 +0000 > > > From: NTOUGHE GUY-SERGE <ntoughe@xxxxxxxxxxx> > > > Subject: Cman hang > > > To: <linux-cluster@xxxxxxxxxx> > > > Message-ID: <BAY119-W410E2F250E8B461752CFC9A5050@xxxxxxx> > > > Content-Type: text/plain; charset="iso-8859-1" > > > > > > > > > > > > Hi gurus, > > > > > > i installed RHEL 5.3 on 2 servers which participating to a cluster > > composed of these 2 nodes: > > > kernel version: > > > kernel-headers-2.6.18-128.el5 > > > kernel-devel-2.6.18-128.el5 > > > kernel-2.6.18-128.el5 > > > cman-devel-2.0.98-1.el5_3.1 > > > cman-2.0.98-1.el5_3.1 > > > cluster-cim-0.12.1-2.el5 > > > lvm2-cluster-2.02.40-7.el5 > > > cluster-snmp-0.12.1-2.el5 > > > modcluster-0.12.1-2.el5 > > > When i want to start cman the following message is sent: > > > cman not started: Multicast and node address families differ. > > /usr/sbin/cman_tool: aisexec daemon didn't start > > > [FAILED] > > > > > > I trier to mount gfs2 > > > and i got theses messages: > > > # mount -t gfs2 /dev/VolGroup01/LogVol01 /appli/prod --o > > lockTablename=arvclust:/appli/prod, Lockproto=lock_dlm > > > > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused > > > > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused > > > > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused > > > > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused > > > > > > do you have any clues? > > > Please it's an hurry, i waste long time to lok for solution help > > > regards > > > > > > > > > > > > > > > > > > > > > > > > > > > ntoughe@xxxxxxxxxxx > > > > > > > > > _________________________________________________________________ > > > With Windows Live, you can organize, edit, and share your photos. 
> > > > > http://www.microsoft.com/middleeast/windows/windowslive/products/photo-gallery-edit.aspx > > > -------------- next part -------------- > > > An HTML attachment was scrubbed... > > > URL: > > https://www.redhat.com/archives/linux-cluster/attachments/20090813/0a55101d/attachment.html > > > > > > ------------------------------ > > > > > > Message: 7 > > > Date: Thu, 13 Aug 2009 09:14:24 -0500 > > > From: David Teigland <teigland@xxxxxxxxxx> > > > Subject: Re: gfs2 mount hangs > > > To: Wengang Wang <wen.gang.wang@xxxxxxxxxx> > > > Cc: linux clustering <linux-cluster@xxxxxxxxxx> > > > Message-ID: <20090813141424.GA8148@xxxxxxxxxx> > > > Content-Type: text/plain; charset=us-ascii > > > > > > On Thu, Aug 13, 2009 at 02:22:11PM +0800, Wengang Wang wrote: > > > > <cman two_node="1" expected_votes="2"/> > > > > > > That's not a valid combination, two_node="1" requires expected_votes="1". > > > > > > You didn't mention which userspace cluster version/release you're using, > > or > > > include any status about the cluster. Before trying to mount gfs on > > either > > > node, collect from both nodes: > > > > > > cman_tool status > > > cman_tool nodes > > > group_tool > > > > > > Then mount on the first node and collect the same information, then try > > > mounting on the second node, collect the same information, and look for > > any > > > errors in /var/log/messages. > > > > > > Since you're using new kernels, you need to be using the cluster 3.0 > > userspace > > > code. You're using the old manual fencing config. There is no more > > > fence_manual; the new way to configure manual fencing is to not configure > > any > > > fencing at all. 
So, your cluster.conf should look like this: > > > > > > <?xml version="1.0"?> > > > <cluster name="testgfs2" config_version="1"> > > > <cman two_node="1" expected_votes="1"/> > > > <clusternodes> > > > <clusternode name="cool" nodeid="1"/> > > > <clusternode name="desk" nodeid="2"/> > > > </clusternodes> > > > </cluster> > > > > > > Dave > > > > > > > > > > > > ------------------------------ > > > > > > Message: 8 > > > Date: Thu, 13 Aug 2009 10:39:46 -0400 > > > From: Lon Hohberger <lhh@xxxxxxxxxx> > > > Subject: Re: Qdisk question > > > To: linux clustering <linux-cluster@xxxxxxxxxx> > > > Message-ID: <1250174386.23376.1440.camel@xxxxxxxxxxxxxxxxxxxxx> > > > Content-Type: text/plain > > > > > > On Thu, 2009-08-13 at 00:45 +0200, brem belguebli wrote: > > > > > > > My understanding of qdisk is that it is used as a tie-breaker, but it > > > > looks like it is more a heatbeat vector than a simple tie-breaker. > > > > > > Right, it's a secondary membership algorithm. > > > > > > > > > > Until here, no real problem indeed, if the site gets apart from the > > > > other prod site and also from the third site (hosting the iscsi target > > > > qdisk) the 2 nodes from the failing site get evicted from the cluster. > > > > > > > > > > > > But, what if my third site gets isolated while the 2 prod ones are > > > > fine ? > > > > > > Qdisk votes will not be presented to CMAN any more, but the two sites > > > should remain online if they still have a "majority" of votes. > > > > > > > > > > The real question is what happens in case all the nodes loose access > > > > to the qdisk while they're still able to see each others ? > > > > > > Qdisk is just a vote like other voting mechanisms. If all nodes lose > > > access at the same time, it should behave like a node death. However, > > > the default action if _one_ node loses access is to kill that node (even > > > if CMAN still sees it). > > > > > > > > > > The 4 nodes have each 1 vote and the qdisk 1 vote. 
> > > > The expected quorum is 3.
> > > >
> > > > If I lose the qdisk, the number of votes falls to 4, the cluster is
> > > > quorate (4>3) but it looks like everything goes bad, each node
> > > > deactivates itself as it can't write its alive status (--> heartbeat
> > > > vector) to the qdisk even if the network heartbeating is working
> > > > fine.
> > >
> > > What happens specifically? Most of the actions qdiskd performs are
> > > configurable. For example, if the nodes are rebooting, you can turn
> > > that behavior off.
> > >
> > > I wrote a simple 'ping' tiebreaker based on the behaviors in RHEL3. It
> > > functions in many ways in the same manner as qdiskd with respect to vote
> > > advertisement to CMAN, but without needing a disk - maybe you would find
> > > it useful?
> > >
> > > http://people.redhat.com/lhh/qnet.tar.gz
> > >
> > > -- Lon
> > >
> > > ------------------------------
> > >
> > > Message: 9
> > > Date: Thu, 13 Aug 2009 17:02:15 +0200
> > > From: Juan Ramon Martin Blanco <robejrm@xxxxxxxxx>
> > > Subject: Re: Cman hang
> > > To: linux clustering <linux-cluster@xxxxxxxxxx>
> > > Message-ID: <8a5668960908130802p4f5168cbueda86d1e6f1324bb@xxxxxxxxxxxxxx>
> > > Content-Type: text/plain; charset="iso-8859-1"
> > >
> > > On Thu, Aug 13, 2009 at 4:16 PM, NTOUGHE GUY-SERGE <ntoughe@xxxxxxxxxxx> wrote:
> > >
> > > > Hi gurus,
> > > >
> > > > I installed RHEL 5.3 on 2 servers which participate in a cluster
> > > > composed of these 2 nodes:
> > > > kernel version:
> > > > kernel-headers-2.6.18-128.el5
> > > > kernel-devel-2.6.18-128.el5
> > > > kernel-2.6.18-128.el5
> > > > cman-devel-2.0.98-1.el5_3.1
> > > > cman-2.0.98-1.el5_3.1
> > > > cluster-cim-0.12.1-2.el5
> > > > lvm2-cluster-2.02.40-7.el5
> > > > cluster-snmp-0.12.1-2.el5
> > > > modcluster-0.12.1-2.el5
> > > > When I want to start cman the following message is sent:
> > > > cman not started: Multicast and node address families differ.
> > > > /usr/sbin/cman_tool: aisexec daemon didn't start
> > > > [FAILED]
> > > >
> > > Please, show us your cluster.conf file so we can help.
> > >
> > > Regards,
> > > Juanra
> > >
> > > > I tried to mount gfs2
> > > > and I got these messages:
> > > > # mount -t gfs2 /dev/VolGroup01/LogVol01 /appli/prod --o
> > > > lockTablename=arvclust:/appli/prod, Lockproto=lock_dlm
> > > >
> > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> > > >
> > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> > > >
> > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> > > >
> > > > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> > > >
> > > > Do you have any clues?
> > > > Please, I'm in a hurry; I have spent a long time looking for a
> > > > solution. Please help.
> > > > Regards
> > > >
> > > > ntoughe@xxxxxxxxxxx
> > >
> > > End of Linux-cluster Digest, Vol 64, Issue 16
> > > *********************************************
>
> End of Linux-cluster Digest, Vol 64, Issue 18
> *********************************************
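For what it's worth, the two_node rule Dave gives in the quoted Message 7 can be checked mechanically before restarting cman. The following is only a minimal Python sketch and is not part of the cluster tools: the helper name check_cman and the sample strings are made up for illustration. It assumes nothing beyond what Dave's example shows, namely that <cman> sits directly under <cluster> and that two_node="1" goes together with expected_votes="1".

```python
import xml.etree.ElementTree as ET


def check_cman(conf_text):
    """Return a list of problems found in a cluster.conf string.

    Hypothetical helper: it checks only the <cman> placement and the
    two_node/expected_votes pairing, nothing else in the file.
    """
    problems = []
    root = ET.fromstring(conf_text)
    cman = root.find("cman")  # matches direct children of <cluster> only
    if cman is None:
        if root.find(".//cman") is not None:
            problems.append("<cman> is nested; expected directly under <cluster>")
        else:
            problems.append("no <cman> element found")
        return problems
    if cman.get("two_node") == "1" and cman.get("expected_votes") != "1":
        problems.append('two_node="1" requires expected_votes="1"')
    return problems


# Layout taken from Dave's suggested cluster.conf.
GOOD = """<cluster name="testgfs2" config_version="1">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="cool" nodeid="1"/>
    <clusternode name="desk" nodeid="2"/>
  </clusternodes>
</cluster>"""

# The combination Dave called invalid.
BAD_VOTES = """<cluster name="testgfs2" config_version="1">
  <cman two_node="1" expected_votes="2"/>
  <clusternodes/>
</cluster>"""

# Illustrative case: <cman> placed inside <clusternodes>.
NESTED = """<cluster name="example" config_version="1">
  <clusternodes>
    <cman two_node="1" expected_votes="1"/>
  </clusternodes>
</cluster>"""

if __name__ == "__main__":
    for label, text in (("good", GOOD), ("bad votes", BAD_VOTES), ("nested", NESTED)):
        print(label, "->", check_cman(text) or "ok")
```

To run it against a real file, read the file in binary mode and pass the bytes to check_cman, since ElementTree rejects a text string that carries an encoding declaration.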
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster
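The vote counting in the quoted qdisk exchange (4 nodes with 1 vote each plus 1 qdisk vote, quorum 3) can be worked through in a few lines. This is only a sketch of the arithmetic, assuming the usual majority rule quorum = expected_votes // 2 + 1. The function names are invented for illustration, and it says nothing about what qdiskd itself does to a node that can no longer reach the disk.

```python
def quorum(expected_votes):
    """Majority threshold, assuming quorum = expected_votes // 2 + 1."""
    return expected_votes // 2 + 1


def is_quorate(votes_present, expected_votes):
    """True when the votes currently present reach the quorum threshold."""
    return votes_present >= quorum(expected_votes)


NODE_VOTES = 4   # 4 nodes, 1 vote each (from the quoted message)
QDISK_VOTES = 1  # 1 vote for the quorum disk
EXPECTED = NODE_VOTES + QDISK_VOTES

if __name__ == "__main__":
    print("quorum:", quorum(EXPECTED))                          # 3
    print("all nodes, no qdisk:", is_quorate(4, EXPECTED))      # True
    print("one site (2 nodes), no qdisk:", is_quorate(2, EXPECTED))             # False
    print("one site (2 nodes) plus qdisk:", is_quorate(2 + QDISK_VOTES, EXPECTED))  # True
```

On vote count alone, losing only the qdisk leaves the 4 nodes quorate, which matches Lon's remark that the sites should stay online while they still hold a majority.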