On 1/6/2012 2:00 AM, Steven Whitehouse wrote:
> Hi,
>
> On Thu, 2012-01-05 at 13:54 -0800, Wes Modes wrote:
>> Howdy, y'all. I'm trying to set up GFS in a cluster on CentOS systems
>> running on VMware. The GFS FS is on a Dell EqualLogic SAN.
>>
>> I keep running into the same problem despite many differently-flavored
>> attempts to set up GFS. The problem comes when I try to start cman, the
>> cluster management software.
>>
>> [root@test01]# service cman start
>> Starting cluster:
>> Loading modules... done
>> Mounting configfs... done
>> Starting ccsd... done
>> Starting cman... failed
>> cman not started: Can't find local node name in cluster.conf
>> /usr/sbin/cman_tool: aisexec daemon didn't start
>> [FAILED]
>>
> This looks like what it says... whatever the node name is in
> cluster.conf, it doesn't exist when the name is looked up, or possibly
> it does exist, but is mapped to the loopback address (it needs to map to
> an address which is valid cluster-wide).
>
> Since your config files look correct, the next thing to check is what
> the resolver is actually returning. Try (for example) a ping to test01
> (you need to specify exactly the same form of the name as is used in
> cluster.conf) from test02 and see whether it uses the correct IP
> address, just in case the wrong thing is being returned.
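
The resolver check described above can also be scripted. What follows is only a minimal sketch in Python, not anything cman itself runs: the helper names resolve_ipv4, is_loopback and describe are made up for this illustration, and it looks at IPv4 only. It asks the system resolver for a name and reports whether the answer is missing or is a loopback address.

```python
import ipaddress
import socket


def resolve_ipv4(name):
    """Return the sorted IPv4 addresses the system resolver gives for name.

    An empty list means the name did not resolve at all.
    """
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def is_loopback(address):
    """True for 127.0.0.0/8 (or ::1), which is not valid cluster-wide."""
    return ipaddress.ip_address(address).is_loopback


def describe(name, addresses):
    """Summarise a lookup result in one line (hypothetical helper)."""
    if not addresses:
        return f"{name}: does not resolve"
    loopback = [a for a in addresses if is_loopback(a)]
    if loopback:
        return f"{name}: resolves to loopback {', '.join(loopback)}"
    return f"{name}: resolves to {', '.join(addresses)}"
```

Run on each node, print(describe("test01", resolve_ipv4("test01"))) should be given exactly the form of the name that appears in cluster.conf, and the output compared with the address the other node is expected to use.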
>
> Steve.
>
>> [root@test01]# tail /var/log/messages
>> Jan 5 13:39:40 testbench06 ccsd[13194]: Unable to connect to cluster infrastructure after 1193640 seconds.
>> Jan 5 13:40:10 testbench06 ccsd[13194]: Unable to connect to cluster infrastructure after 1193670 seconds.
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive Service RELEASE 'subrev 1887 version 0.80.6'
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive Service: started and ready to provide service.
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] local node name "test01.gdao.ucsc.edu" not found in cluster.conf
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading CCS info, cannot start
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading config from CCS
>> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive exiting (reason: could not read the main configuration file).
>>
>> Here are details of my configuration:
>>
>> [root@test01]# rpm -qa | grep cman
>> cman-2.0.115-85.el5_7.2
>>
>> [root@test01]# echo $HOSTNAME
>> test01.gdao.ucsc.edu
>>
>> [root@test01]# hostname
>> test01.gdao.ucsc.edu
>>
>> [root@test01]# cat /etc/hosts
>> # Do not remove the following line, or various programs
>> # that require network functionality will fail.
>> 128.114.31.112 test01 test01.gdao test01.gdao.ucsc.edu
>> 128.114.31.113 test02 test02.gdao test02.gdao.ucsc.edu
>> 127.0.0.1 localhost.localdomain localhost
>> ::1 localhost6.localdomain6 localhost6
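
A common way for a node name to end up on the loopback address is a hosts file that lists the hostname on the 127.0.0.1 line. The sketch below is an illustration in Python under that assumption: parse_hosts and loopback_names are hypothetical helpers, and the sample text is the hosts file quoted above. It maps each name to the addresses listed for it and reports which names sit on a loopback address.

```python
import ipaddress


def parse_hosts(text):
    """Map each host name in hosts-file text to the addresses listed for it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, []).append(address)
    return table


def _is_loopback(address):
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Not a plain IP literal; treat as "not loopback" in this sketch.
        return False


def loopback_names(table):
    """Return the sorted names that map to at least one loopback address."""
    return sorted(
        name for name, addresses in table.items()
        if any(_is_loopback(a) for a in addresses)
    )


# The hosts file quoted in the message above.
SAMPLE_HOSTS = """
128.114.31.112 test01 test01.gdao test01.gdao.ucsc.edu
128.114.31.113 test02 test02.gdao test02.gdao.ucsc.edu
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
"""

table = parse_hosts(SAMPLE_HOSTS)
print(table["test01"])        # addresses listed for the short name
print(loopback_names(table))  # names that live on 127.0.0.1 or ::1
```

For the file as quoted, only the localhost entries land on loopback, so on this evidence the hosts file itself does not map test01 or test02 to 127.0.0.1. It says nothing about what DNS or other resolver sources return.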
>>
>> [root@test01]# sestatus
>> SELinux status: enabled
>> SELinuxfs mount: /selinux
>> Current mode: permissive
>> Mode from config file: permissive
>> Policy version: 21
>> Policy from config file: targeted
>>
>> [root@test01]# cat /etc/cluster/cluster.conf
>> <?xml version="1.0"?>
>> <cluster config_version="25" name="gdao_cluster">
>>     <fence_daemon post_fail_delay="0" post_join_delay="120"/>
>>     <clusternodes>
>>         <clusternode name="test01" nodeid="1" votes="1">
>>             <fence>
>>                 <method name="single">
>>                     <device name="gfs_vmware"/>
>>                 </method>
>>             </fence>
>>         </clusternode>
>>         <clusternode name="test02" nodeid="2" votes="1">
>>             <fence>
>>                 <method name="single">
>>                     <device name="gfs_vmware"/>
>>                 </method>
>>             </fence>
>>         </clusternode>
>>     </clusternodes>
>>     <cman/>
>>     <fencedevices>
>>         <fencedevice agent="fence_manual" name="gfs1_ipmi"/>
>>         <fencedevice agent="fence_vmware" name="gfs_vmware"
>>             ipaddr="gdvcenter.ucsc.edu" login="root" passwd="1hateAmazon.com"
>>             vmlogin="root" vmpasswd="esxpass"
>>             port="/vmfs/volumes/49086551-c64fd83c-0401-001e0bcd6848/eagle1/gfs1.vmx"/>
>>     </fencedevices>
>>     <rm>
>>         <failoverdomains/>
>>     </rm>
>> </cluster>
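
The log complains about the name "test01.gdao.ucsc.edu" while cluster.conf lists the nodes as "test01" and "test02", so it can help to see which forms of the local hostname literally appear in the config. The Python sketch below does a plain string comparison only. How cman itself matches names is not described in this thread and may be more lenient, and cluster_node_names, name_forms and matching_forms are hypothetical helpers. The sample XML is trimmed from the config above.

```python
import xml.etree.ElementTree as ET


def cluster_node_names(xml_text):
    """Return the name attribute of every <clusternode> in cluster.conf text."""
    root = ET.fromstring(xml_text.strip())
    return [node.get("name") for node in root.iter("clusternode")]


def name_forms(hostname):
    """Return hostname and each shorter form made by dropping trailing labels."""
    labels = hostname.split(".")
    return [".".join(labels[:i]) for i in range(len(labels), 0, -1)]


def matching_forms(hostname, node_names):
    """Return the forms of hostname that appear verbatim among node_names."""
    return [form for form in name_forms(hostname) if form in node_names]


# Trimmed from the cluster.conf quoted above (fencing details omitted).
SAMPLE_CONF = """
<cluster config_version="25" name="gdao_cluster">
    <clusternodes>
        <clusternode name="test01" nodeid="1" votes="1"/>
        <clusternode name="test02" nodeid="2" votes="1"/>
    </clusternodes>
</cluster>
"""

names = cluster_node_names(SAMPLE_CONF)
print(names)
print(matching_forms("test01.gdao.ucsc.edu", names))
```

On a live node the first argument would come from socket.gethostname(), and the XML would be read from /etc/cluster/cluster.conf. If the full hostname is not among the matches, one thing to try is making the clusternode name and the output of hostname agree exactly.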
>>
>> I've seen much discussion of this problem, but no definitive solutions.
>> Any help you can provide will be welcome.
>>
>> Wes Modes
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster@xxxxxxxxxx
>> https://www.redhat.com/mailman/listinfo/linux-cluster