I intend to set up an Oracle9i RAC cluster using GFS 6.0 as the cluster file system (CFS). Because the SAN is not available at the moment, I decided to use GNBD instead. I have three IA64 servers running RHAS 3 Update 6, called node1, node2 and node3. Node1 and node2 are GNBD clients and GFS nodes. Node3 is the GNBD server. All three also act as lock servers.
I followed the GFS 6.0 Administrator's Guide and encountered no problems until I tried to mount the GFS file system on node2. The command "mount -t gfs /dev/pool/pool0 /gfs -o acl" hung and never returned. I killed the mount process and tried again on node1. This time it returned an error like the one you get when you try to mount an unknown file system type. I removed the gfs module with rmmod and loaded it again with modprobe, but still no luck. I checked against the startup procedure and found that although I had started lock_gulmd on all nodes, only node3 had a running instance. There was no sign of lock_gulmd on node1 or node2. I tried to start lock_gulmd again, and after a few attempts it started, but only on node2, and mounting GFS still didn't work.
I didn't know what to do next, so I decided to start over. After turning off all the GFS-related daemons with chkconfig, I rebooted the servers (a bad habit from my Windows days :( ). Once all the servers were up again, I got the error "gnbd_export error: create request failed : Connection refused" when executing the following commands on node3 (in order to export the device over GNBD):
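For anyone repeating the lock_gulmd check across nodes, here is a minimal sketch of how it could be scripted. It assumes passwordless ssh to each node and that pgrep (from procps) is installed there; the function name check_lock_gulmd and the NODES and REMOTE variables are hypothetical helpers, not part of GFS.

```shell
# Sketch only: report on which nodes a lock_gulmd process is visible.
# Assumptions: node names as in this post, passwordless ssh, pgrep available.
NODES="${NODES:-node1 node2 node3}"
REMOTE="${REMOTE:-ssh}"   # command used to reach a node; can be overridden for a dry run

check_lock_gulmd() {
    missing=0
    for n in $NODES; do
        # pgrep -x matches the exact process name and exits non-zero if none is found
        if $REMOTE "$n" pgrep -x lock_gulmd >/dev/null 2>&1; then
            echo "$n: lock_gulmd running"
        else
            echo "$n: lock_gulmd NOT running"
            missing=$((missing + 1))
        fi
    done
    # Exit status is the number of nodes without a running lock_gulmd
    return "$missing"
}

# Usage:
#   check_lock_gulmd || echo "lock_gulmd is missing on at least one node"
```

A non-zero status only says that no process named lock_gulmd was found on some node; it does not say why the daemon exited, so the system log on that node is still the place to look.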
# modprobe gnbd_serv
[root@db-svr-test-03 root]# lsmod
Module                  Size  Used by    Not tainted
gnbd_serv              74288   0  (unused)
lock_gulm             149872   0  [gnbd_serv]
lock_harness            7288   0  [lock_gulm]
....
# gnbd_export -d /dev/cciss/c0d0p4 -e cluster.cca
gnbd_export error: create request failed : Connection refused
As you can see below, gnbd_serv was running and listening on the default port, 14243:
[root@db-svr-test-03 root]# netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:14243           0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
.....
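The same listener check can be done without reading the output by eye. This is a minimal sketch that assumes the netstat -nat column layout shown above (local address in the fourth field, state in the last); the function name port_listening is hypothetical.

```shell
# Sketch only: succeed if the given TCP port is in LISTEN state
# in "netstat -nat"-style output read from stdin.
port_listening() {
    port="$1"
    awk -v p="$port" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, a, ":")   # local address is field 4; the port is its last piece
            if (a[n] == p) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the GNBD server (node3):
#   netstat -nat | port_listening 14243 && echo "port 14243 is listening"
```

A listening socket only shows that something accepts connections on that port; as the "Connection refused" error here suggests, it does not prove that the export request itself will succeed.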
I also ran "tcpdump -vv -i lo port 14243" on node3 and saw some traffic when I re-executed "gnbd_export -d /dev/cciss/c0d0p4 -e cluster.cca". The connection even completed the three-way handshake, but while the client side was pushing data, the server suddenly sent a FIN (F) packet.
I even removed the GFS and GFS-modules RPMs and reinstalled them, but still no luck. What am I supposed to do now? Any help is appreciated.
Regards,
--Thai Duong.
-- Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster