Hi All,
I've successfully installed and configured GFS on my three nodes, but when I try to mount the filesystem the prompt hangs until I kill the mount command. All servers are running RHEL 3 AS/ES U6 with the 2.4.21-37.0.1.ELsmp kernel and are connected to an MSA1500 SAN via FC. I've installed the following GFS rpms:
[root@oradw root]# rpm -qa | grep -i gfs
GFS-modules-6.0.2.27-0.1
GFS-modules-smp-6.0.2.27-0.1
GFS-6.0.2.27-0.1
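For what it's worth, after a failed mount attempt the matching kernel modules do show up as loaded (module names taken from the kernel messages further down); this is just the check I ran:

lsmod | grep -iE 'gfs|gulm|lock_harness|pool'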
Here are my pool configuration files and the output from pool_tool -s:
[root@backup gfs]# cat cluster_cca.cfg
poolname cluster_cca
subpools 1
subpool 0 0 1
pooldevice 0 0 /dev/sda1
[root@backup gfs]# cat pool0.cfg
poolname pool_gfs1
subpools 1
subpool 0 0 1
pooldevice 0 0 /dev/sda2
[root@backup gfs]# cat pool1.cfg
poolname pool_gfs2
subpools 1
subpool 0 0 1
pooldevice 0 0 /dev/sdb
[root@backup gfs]# pool_tool -s
Device                  Pool Label
======                  ==========
/dev/pool/cluster_cca   <- CCA device ->
/dev/pool/pool_gfs1     <- GFS filesystem ->
/dev/pool/pool_gfs2     <- GFS filesystem ->
/dev/cciss/c0d0         <- partition information ->
/dev/cciss/c0d0p1       <- EXT2/3 filesystem ->
/dev/cciss/c0d0p2       <- swap device ->
/dev/cciss/c0d0p3       <- lvm1 subdevice ->
/dev/sda                <- partition information ->
/dev/sda1               cluster_cca
/dev/sda2               pool_gfs1
/dev/sdb                pool_gfs2
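In case I've gone wrong earlier in the process, the pools themselves were created and activated roughly like this (from memory, so the exact invocation may be slightly off):

# create the pool labels from the config files (run once, from one node)
pool_tool -c cluster_cca.cfg
pool_tool -c pool0.cfg
pool_tool -c pool1.cfg

# activate all pools and confirm the /dev/pool devices appear (run on every node)
pool_assemble -a
pool_tool -s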
Here are my ccs files.
[root@backup cluster_cca]# cat cluster.ccs
cluster {
    name = "cluster_cca"
    lock_gulm {
        servers = ["backup", "oradw", "gistest2"]
    }
}
[root@backup cluster_cca]# cat fence.ccs
fence_devices {
    manual {
        agent = "fence_manual"
    }
}
[root@backup cluster_cca]# cat nodes.ccs
nodes {
    backup {
        ip_interfaces {
            eth1 = "10.0.0.1"
        }
        fence {
            man {
                manual {
                    ipaddr = " 10.0.0.1"
                }
            }
        }
    }
    oradw {
        ip_interfaces {
            eth4 = " 10.0.0.2"
        }
        fence {
            man {
                manual {
                    ipaddr = " 10.0.0.2"
                }
            }
        }
    }
    gistest2 {
        ip_interfaces {
            eth0 = " 10.0.0.3"
        }
        fence {
            man {
                manual {
                    ipaddr = " 10.0.0.3"
                }
            }
        }
    }
}
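For completeness, the CCS archive was written to the CCA pool with roughly this command (run from the directory shown in the prompts above; flags are from memory, so please correct me if this step looks wrong):

ccs_tool create /root/cluster_cca /dev/pool/cluster_cca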
Here is the command I used to create the filesystem:
gfs_mkfs -p lock_gulm -t cluster_cca:pool_gfs2 -j 10 /dev/pool/pool_gfs2
Mount command that hangs:
mount -t gfs /dev/pool/pool_gfs2 /gfs2
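And this is the order in which I bring things up on each node before attempting the mount (it matches the daemons that appear in the log below; flags are from memory and I may well be missing a step):

pool_assemble -a                  # activate the pools
ccsd -d /dev/pool/cluster_cca     # start ccsd against the CCA device
lock_gulmd                        # start the lock manager
mount -t gfs /dev/pool/pool_gfs2 /gfs2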
Here is the output I see in my messages log file; the last 5 lines are repeated each time I try to mount the filesystem.
Mar 17 15:47:05 backup ccsd[2645]: Starting ccsd 6.0.2.27:
Mar 17 15:47:05 backup ccsd[2645]: Built: Jan 30 2006 15:28:33
Mar 17 15:47:05 backup ccsd[2645]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Mar 17 15:48:10 backup lock_gulmd[2652]: Starting lock_gulmd 6.0.2.27. (built Jan 30 2006 15:28:54) Copyright (C) 2004 Red Hat, Inc. All rights reserved.
Mar 17 15:48:10 backup lock_gulmd[2652]: You are running in Fail-over mode.
Mar 17 15:48:10 backup lock_gulmd[2652]: I am (backup) with ip (127.0.0.1)
Mar 17 15:48:10 backup lock_gulmd[2652]: Forked core [2653].
Mar 17 15:48:11 backup lock_gulmd[2652]: Forked locktable [2654].
Mar 17 15:48:12 backup lock_gulmd[2652]: Forked ltpx [2655].
Mar 17 15:48:12 backup lock_gulmd_core[2653]: I see no Masters, So I am Arbitrating until enough Slaves talk to me.
Mar 17 15:48:12 backup lock_gulmd_core[2653]: Could not send quorum update to slave backup
Mar 17 15:48:12 backup lock_gulmd_core[2653]: New generation of server state. (1142628492484630)
Mar 17 15:48:12 backup lock_gulmd_LTPX[2655]: New Master at backup: 127.0.0.1
Mar 17 15:52:14 backup kernel: Lock_Harness 6.0.2.27 (built Jan 30 2006 15:32:58) installed
Mar 17 15:52:14 backup kernel: GFS 6.0.2.27 (built Jan 30 2006 15:32:20) installed
Mar 17 15:52:15 backup kernel: Gulm 6.0.2.27 (built Jan 30 2006 15:32:54) installed
Mar 17 15:54:51 backup kernel: lock_gulm: ERROR cm_login failed. -512
Mar 17 15:54:51 backup kernel: lock_gulm: ERROR Got a -512 trying to start the threads.
Mar 17 15:54:51 backup lock_gulmd_core[2653]: Error on xdr (GFS Kernel Interface:127.0.0.1 idx:3 fd:8): (-104:104:Connection reset by peer)
Mar 17 15:54:51 backup kernel: lock_gulm: fsid=cluster_cca:gfs1: Exiting gulm_mount with errors -512
Mar 17 15:54:51 backup kernel: GFS: can't mount proto = lock_gulm, table = cluster_cca:gfs1, hostdata =
Result from gulm_tool:
[root@backup gfs]# gulm_tool nodelist backup
Name: backup
ip = 127.0.0.1
state = Logged in
mode = Arbitrating
missed beats = 0
last beat = 1142632189718986
delay avg = 10019686
max delay = 10019735
I'm a newbie to clusters and have no clue where to look next. If any other information is needed, please let me know.
Thanks,
--
Magnus Andersen
Systems Administrator / Oracle DBA
Walker & Associates, Inc.