Hi,

Check /etc/sysconfig/cman; maybe there is a different name present as NODENAME. Remove the file (if present) or try to create one as:

#CMAN_CLUSTER_TIMEOUT=120
#CMAN_QUORUM_TIMEOUT=0
#CMAN_SHUTDOWN_TIMEOUT=60
FENCED_START_TIMEOUT=120
##FENCE_JOIN=no
#LOCK_FILE="/var/lock/subsys/cman"
CLUSTERNAME=ClusterName
NODENAME=NodeName

On Sun, 08 Jan 2012 20:03:18 -0800, Wes Modes <wmodes@xxxxxxxx> wrote:
> The behavior of cman's resolution of cluster node names is less than
> clear, as per the RHEL bugzilla report.
>
> The hostname and cluster.conf match, as do /etc/hosts and uname -n.
> Both the short names and the FQDNs ping. I believe the cluster.conf
> files on all nodes are in sync, and all nodes are accessible to each
> other using either short or long names.
>
> You'll have to trust that I've tried everything obvious, and every
> possible combination of FQDN and short names in cluster.conf and
> hostname. That said, it is entirely possible I missed something obvious.
>
> I suspect there is something else going on and I don't know how to get
> at it.
>
> Wes
>
> On 1/6/2012 6:06 PM, Kevin Stanton wrote:
>> > Hi,
>> > I think CMAN expects the names of the cluster nodes to be the same
>> > as those returned by the command "uname -n".
>> > From what you write, your nodes' hostnames are test01.gdao.ucsc.edu
>> > and test02.gdao.ucsc.edu, but in cluster.conf you have declared only
>> > "test01" and "test02".
>>
>> I haven't found this to be the case in the past. I actually use a
>> separate short name to reference each node, which is different from
>> the hostname of the server itself. All I've ever had to do is make
>> sure it resolves correctly. You can do this in DNS and/or in
>> /etc/hosts. I have found that it's a good idea to do both, in case
>> your DNS server is a virtual machine and is not running for some
>> reason. In that case, with /etc/hosts you can still start cman.
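[Editor's note: the name match discussed above can be sketched as a short script. This is a rough illustration only: the conf fragment is inline and THIS_NODE is hard-coded as stand-ins for /etc/cluster/cluster.conf and "$(uname -n)". Real cman additionally requires the name to resolve to a non-loopback, cluster-reachable address.]

```shell
#!/bin/sh
# Sketch of the check cman performs at startup: the local node name
# must appear as a <clusternode name="..."> entry in cluster.conf.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<clusternode name="test01" nodeid="1" votes="1"/>
<clusternode name="test02" nodeid="2" votes="1"/>
EOF

THIS_NODE="test01.gdao.ucsc.edu"    # stand-in for $(uname -n)

found=no
for n in $(sed -n 's/.*clusternode name="\([^"]*\)".*/\1/p' "$CONF"); do
    # compare against both the full name and its short (pre-dot) form
    if [ "$n" = "$THIS_NODE" ] || [ "$n" = "${THIS_NODE%%.*}" ]; then
        found=yes
    fi
done
echo "local node found: $found"
rm -f "$CONF"
```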
>> I would make sure whatever node names you use in cluster.conf resolve
>> when you try to ping them from all nodes in the cluster. Also make
>> sure your cluster.conf is in sync between all nodes.
>>
>> -Kevin
>>
>> ------------------------------------------------------------------------
>>
>> These servers are currently on the same host, but may not be in the
>> future. They are in a VM cluster (though honestly, I'm not sure what
>> this means yet).
>>
>> SELinux is on, but not enforcing.
>> Firewalling through iptables is turned off via
>> system-config-securitylevel.
>>
>> There is currently no line in cluster.conf that deals with
>> multicasting.
>>
>> Any other suggestions?
>>
>> Wes
>>
>> On 1/6/2012 12:05 PM, Luiz Gustavo Tonello wrote:
>> Hi,
>>
>> Are these servers on VMware? On the same host?
>> Is SELinux disabled? Is there anything in iptables?
>>
>> In my environment I had a problem starting GFS2 with servers on
>> different hosts.
>> To cluster the servers, I had to migrate one server to the same host
>> as the other, and restart it.
>>
>> I think one of the problems was the virtual switches.
>> To solve it, I changed the multicast IP in cluster.conf to use
>> 225.0.0.13:
>> <multicast addr="225.0.0.13"/>
>> And added a static route on both nodes, to use the default gateway.
>>
>> I don't know if it's correct, but this solved my problem.
>>
>> I hope that helps you.
>>
>> Regards.
>>
>> On Fri, Jan 6, 2012 at 5:01 PM, Wes Modes <wmodes@xxxxxxxx> wrote:
>> Hi, Steven.
>>
>> I've tried just about every possible combination of hostname and
>> cluster.conf.
>>
>> ping to test01 resolves to 128.114.31.112
>> ping to test01.gdao.ucsc.edu resolves to 128.114.31.112
>>
>> It feels like the right thing is being returned. This feels like it
>> might be a quirk (or possibly a bug) of cman or openais.
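[Editor's note: Luiz's multicast workaround above can be sketched as follows. The nesting of <multicast> under <cman> follows the RHEL 5 cluster schema, and the interface name in the route command is an assumption; only the range check at the end is executable.]

```shell
#!/bin/sh
# The cluster.conf fragment:
#
#     <cman>
#         <multicast addr="225.0.0.13"/>
#     </cman>
#
# plus a static route on each node so multicast traffic leaves via a
# known interface (interface name eth0 is an assumption):
#
#     route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0
#
# A quick sanity check that the chosen address really falls in the
# IPv4 multicast range, 224.0.0.0/4:
ADDR=225.0.0.13
first_octet=${ADDR%%.*}
if [ "$first_octet" -ge 224 ] && [ "$first_octet" -le 239 ]; then
    echo "$ADDR is a valid IPv4 multicast address"
else
    echo "$ADDR is NOT in the multicast range"
fi
```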
>> There are some old bug reports around this, for example
>> https://bugzilla.redhat.com/show_bug.cgi?id=488565. It sounds like the
>> way that cman reports this error is anything but straightforward.
>>
>> Is there anyone who has encountered this error and found a solution?
>>
>> Wes
>>
>> On 1/6/2012 2:00 AM, Steven Whitehouse wrote:
>> > Hi,
>> >
>> > On Thu, 2012-01-05 at 13:54 -0800, Wes Modes wrote:
>> >> Howdy, y'all. I'm trying to set up GFS in a cluster on CentOS
>> >> systems running on VMware. The GFS filesystem is on a Dell
>> >> EqualLogic SAN.
>> >>
>> >> I keep running into the same problem despite many
>> >> differently-flavored attempts to set up GFS. The problem comes when
>> >> I try to start cman, the cluster management software.
>> >>
>> >> [root@test01]# service cman start
>> >> Starting cluster:
>> >>    Loading modules... done
>> >>    Mounting configfs... done
>> >>    Starting ccsd... done
>> >>    Starting cman... failed
>> >>    cman not started: Can't find local node name in cluster.conf
>> >>    /usr/sbin/cman_tool: aisexec daemon didn't start
>> >>                                                        [FAILED]
>> >>
>> > This looks like what it says: whatever the node name is in
>> > cluster.conf, it doesn't exist when the name is looked up, or
>> > possibly it does exist, but is mapped to the loopback address (it
>> > needs to map to an address which is valid cluster-wide).
>> >
>> > Since your config files look correct, the next thing to check is
>> > what the resolver is actually returning. Try (for example) a ping to
>> > test01 (you need to specify exactly the same form of the name as is
>> > used in cluster.conf) from test02 and see whether it uses the
>> > correct IP address, just in case the wrong thing is being returned.
>> >
>> > Steve.
>> >
>> >> [root@test01]# tail /var/log/messages
>> >> Jan  5 13:39:40 testbench06 ccsd[13194]: Unable to connect to
>> >> cluster infrastructure after 1193640 seconds.
>> >> Jan  5 13:40:10 testbench06 ccsd[13194]: Unable to connect to
>> >> cluster infrastructure after 1193670 seconds.
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive
>> >> Service RELEASE 'subrev 1887 version 0.80.6'
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C)
>> >> 2002-2006 MontaVista Software, Inc and contributors.
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C)
>> >> 2006 Red Hat, Inc.
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive
>> >> Service: started and ready to provide service.
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] local node name
>> >> "test01.gdao.ucsc.edu" not found in cluster.conf
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading
>> >> CCS info, cannot start
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading
>> >> config from CCS
>> >> Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive
>> >> exiting (reason: could not read the main configuration file).
>> >>
>> >> Here are details of my configuration:
>> >>
>> >> [root@test01]# rpm -qa | grep cman
>> >> cman-2.0.115-85.el5_7.2
>> >>
>> >> [root@test01]# echo $HOSTNAME
>> >> test01.gdao.ucsc.edu
>> >>
>> >> [root@test01]# hostname
>> >> test01.gdao.ucsc.edu
>> >>
>> >> [root@test01]# cat /etc/hosts
>> >> # Do not remove the following line, or various programs
>> >> # that require network functionality will fail.
>> >> 128.114.31.112    test01 test01.gdao test01.gdao.ucsc.edu
>> >> 128.114.31.113    test02 test02.gdao test02.gdao.ucsc.edu
>> >> 127.0.0.1         localhost.localdomain localhost
>> >> ::1               localhost6.localdomain6 localhost6
>> >>
>> >> [root@test01]# sestatus
>> >> SELinux status:                 enabled
>> >> SELinuxfs mount:                /selinux
>> >> Current mode:                   permissive
>> >> Mode from config file:          permissive
>> >> Policy version:                 21
>> >> Policy from config file:        targeted
>> >>
>> >> [root@test01]# cat /etc/cluster/cluster.conf
>> >> <?xml version="1.0"?>
>> >> <cluster config_version="25" name="gdao_cluster">
>> >>     <fence_daemon post_fail_delay="0" post_join_delay="120"/>
>> >>     <clusternodes>
>> >>         <clusternode name="test01" nodeid="1" votes="1">
>> >>             <fence>
>> >>                 <method name="single">
>> >>                     <device name="gfs_vmware"/>
>> >>                 </method>
>> >>             </fence>
>> >>         </clusternode>
>> >>         <clusternode name="test02" nodeid="2" votes="1">
>> >>             <fence>
>> >>                 <method name="single">
>> >>                     <device name="gfs_vmware"/>
>> >>                 </method>
>> >>             </fence>
>> >>         </clusternode>
>> >>     </clusternodes>
>> >>     <cman/>
>> >>     <fencedevices>
>> >>         <fencedevice agent="fence_manual" name="gfs1_ipmi"/>
>> >>         <fencedevice agent="fence_vmware" name="gfs_vmware"
>> >>             ipaddr="gdvcenter.ucsc.edu" login="root"
>> >>             passwd="1hateAmazon.com"
>> >>             vmlogin="root" vmpasswd="esxpass"
>> >>             port="/vmfs/volumes/49086551-c64fd83c-0401-001e0bcd6848/eagle1/gfs1.vmx"/>
>> >>     </fencedevices>
>> >>     <rm>
>> >>         <failoverdomains/>
>> >>     </rm>
>> >> </cluster>
>> >>
>> >> I've seen much discussion of this problem, but no definitive
>> >> solutions. Any help you can provide will be welcome.
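[Editor's note: Steven's suggestion earlier in the thread, checking what the resolver actually returns and that it is not loopback, can be sketched as below. The address is hard-coded to the one from this thread's /etc/hosts; on a real node something like `addr=$(getent hosts test01 | awk '{print $1}')` would fetch it.]

```shell
#!/bin/sh
# The address the resolver returns for the cluster.conf node name must
# be usable cluster-wide, i.e. not in the 127.0.0.0/8 loopback range.
addr="128.114.31.112"
case "$addr" in
    127.*) result="BAD (loopback, not valid cluster-wide)" ;;
    *)     result="OK (usable cluster-wide)" ;;
esac
echo "$addr -> $result"
```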
>> >> Wes Modes
>> >>
>> >> --
>> >> Linux-cluster mailing list
>> >> Linux-cluster@xxxxxxxxxx
>> >> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>> --
>> Luiz Gustavo P Tonello.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster