On Fri, Jan 22, 2010 at 9:00 AM, King, Adam <adam.king@xxxxxxxxxxxxxxxx> wrote:
> I'm assuming you have read this?
> http://sources.redhat.com/cluster/wiki/FAQ/CMAN#cman_2to3
>
> Adam King
> Systems Administrator
> adam.king@xxxxxxxxxxxxxxxx
>
> InTechnology plc
> Support 0845 120 7070
> Telephone 01423 850000
> Facsimile 01423 858866
> www.intechnology.com
>
> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Terry
> Sent: 22 January 2010 14:45
> To: linux clustering
> Subject: Re: cannot add 3rd node to running cluster
>
> On Mon, Jan 4, 2010 at 1:34 PM, Abraham Alawi <a.alawi@xxxxxxxxxxxxxx> wrote:
>>
>> On 1/01/2010, at 5:13 AM, Terry wrote:
>>
>>> On Wed, Dec 30, 2009 at 10:13 AM, Terry <td3201@xxxxxxxxx> wrote:
>>>> On Tue, Dec 29, 2009 at 5:20 PM, Jason W. <jwellband@xxxxxxxxx> wrote:
>>>>> On Tue, Dec 29, 2009 at 2:30 PM, Terry <td3201@xxxxxxxxx> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have a working 2-node cluster that I am trying to add a third node
>>>>>> to. I am trying to use Red Hat's Conga (luci) to add the node in, but
>>>>>
>>>>> If you have a two-node cluster with two_node=1 in cluster.conf - such as
>>>>> two nodes with no quorum device to break a tie - you'll need to bring
>>>>> the cluster down, change two_node to 0 on both nodes (and rev the
>>>>> cluster version at the top of cluster.conf), bring the cluster up, and
>>>>> then add the third node.
>>>>>
>>>>> For troubleshooting any cluster issue, take a look at syslog
>>>>> (/var/log/messages by default). It can help to watch it on a
>>>>> centralized syslog server that all of your nodes forward logs to.
>>>>>
>>>>> --
>>>>> HTH, YMMV, HANW :)
>>>>>
>>>>> Jason
>>>>>
>>>>> The path to enlightenment is /usr/bin/enlightenment.
>>>>
>>>> Thank you for the response. /var/log/messages doesn't have any
>>>> errors. It says cman started, then says it can't connect to the cluster
>>>> infrastructure after a few seconds. My cluster does not have the
>>>> two_node=1 config now. Conga took that out for me. That bit me last
>>>> night because I needed to put that back in.
>>>
>>> CMAN still will not start and gives no debug information. Anyone know
>>> why 'cman_tool -d join' would not print any output at all?
>>> Troubleshooting this is kind of a nightmare. I verified that two_node
>>> is not in play.
>>
>> Try this line in your cluster.conf file:
>> <logging debug="on" logfile="/var/log/rhcs.log" to_file="yes"/>
>>
>> Also, if you are sure your cluster.conf is correct, then copy it manually
>> to all the nodes, add clean_start="1" to the fence_daemon line in
>> cluster.conf, and run 'service cman start' simultaneously on all the nodes
>> (probably a good idea to do that from runlevel 1, but make sure you have
>> the network up first).
>>
>> Cheers,
>>
>> -- Abraham
>>
>> Abraham Alawi
>> Unix/Linux Systems Administrator
>> Science IT
>> University of Auckland
>> e: a.alawi@xxxxxxxxxxxxxx
>> p: +64-9-373 7599, ext#: 87572
>
> I am still battling this. I stopped the cluster completely, modified
> the config, and then started it, but that didn't work either. Same
> issue.
> I noticed clurgmgrd wasn't staying running, so I then tried this:
>
> [root@omadvnfs01c ~]# clurgmgrd -d -f
> [7014] notice: Waiting for CMAN to start
>
> Then in another window I issued:
> [root@omadvnfs01c ~]# cman_tool join
>
> Then back in the other window, below "[7014] notice: Waiting for CMAN
> to start", I got:
> failed acquiring lockspace: Transport endpoint is not connected
> Locks not working!
>
> Anyone know what could be going on?

I didn't, but I performed those steps anyway. As it sits, I have a
three-node cluster with only two nodes in it, which is bad too, but it
is what it is until I figure this out. Here's my cluster.conf just for
completeness:

<cluster alias="omadvnfs01" config_version="53" name="omadvnfs01">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="omadvnfs01a.sec.jel.lc" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="omadvnfs01a-drac"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="omadvnfs01b.sec.jel.lc" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="omadvnfs01b-drac"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="omadvnfs01c.sec.jel.lc" nodeid="3" votes="1">
            <fence>
                <method name="1">
                    <device name="omadvnfs01c-drac"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman/>
    <fencedevices>
        <fencedevice agent="fence_drac" ipaddr="10.98.1.211" login="root" name="omadvnfs01a-drac" passwd="foo"/>
        <fencedevice agent="fence_drac" ipaddr="10.98.1.212" login="root" name="omadvnfs01b-drac" passwd="foo"/>
        <fencedevice agent="fence_drac" ipaddr="10.98.1.213" login="root" name="omadvnfs01c-drac" passwd="foo"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="fd_omadvnfs01a-nfs" nofailback="1" ordered="1" restricted="0">
                <failoverdomainnode name="omadvnfs01a.sec.jel.lc" priority="1"/>
            </failoverdomain>
            <failoverdomain name="fd_omadvnfs01b-nfs" nofailback="1" ordered="1" restricted="0">
                <failoverdomainnode name="omadvnfs01b.sec.jel.lc" priority="2"/>
            </failoverdomain>
            <failoverdomain name="fd_omadvnfs01c-nfs" nofailback="1" ordered="1" restricted="0">
                <failoverdomainnode name="omadvnfs01c.sec.jel.lc" priority="1"/>
            </failoverdomain>
        </failoverdomains>

I am not sure if I did a restart after I did the work, though. When it
says "shutdown cluster software", that is simply a 'service cman stop'
on Red Hat, right? I want to make sure I don't need to kill any other
components before updating the configuration manually. I appreciate the
help. I am probably going to try it again this afternoon to double-check
my work.
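While I'm at it, I want to double-check my understanding of Jason's
two_node point. I believe the change amounts to swapping the <cman> line
in cluster.conf (and bumping config_version at the top of the file) from
the two-node special case to normal quorum, roughly like this - the
expected_votes values below are my own assumption for one vote per node,
so correct me if I have that wrong:

    Two-node special case (what the cluster presumably had before Conga touched it):
        <cman expected_votes="1" two_node="1"/>

    Normal quorum with three one-vote nodes:
        <cman expected_votes="3"/>

Right now I just have the bare <cman/> you can see above, which I'm
assuming amounts to the same thing, since cman should work out the
expected votes from the node votes when nothing is specified.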
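And to be explicit about what I was planning for the "shutdown cluster
software" step, in case 'service cman stop' alone really isn't enough:
the order below is what I had in mind for each node. I'm assuming
rgmanager is running everywhere, and the gfs/clvmd lines only apply if
those services are actually in use on these boxes, so treat this as a
sketch rather than gospel:

    service rgmanager stop
    service gfs stop       # only if GFS mounts are managed by the init script
    service clvmd stop     # only if clustered LVM is in use
    service cman stop

and then the reverse order to bring everything back up. For pushing the
hand-edited cluster.conf out to the other nodes afterwards, I was going
to bump config_version and run:

    ccs_tool update /etc/cluster/cluster.conf
    cman_tool version -r <new config_version>

rather than copying the file around by hand - please tell me if that's
not the right way to do it on RHEL 5.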