Hi all,
I'm new to Red Hat GFS. I got the GFS code from the http://sources.redhat.com/cluster site (from CVS,
with tag TRHEL4), then compiled and installed GFS from source. I followed the steps in the
cluster/doc/min-gfs.txt file. I want to use GFS with a GNBD server across 3 machines.
The cluster/doc/min-gfs.txt file looks like this:
Minimum GFS How To
-----------------
The following gfs configuration requires a minimum amount of hardware and
no expensive storage system. It's the cheapest and quickest way to "play"
with gfs.
    --------------       --------------
    |    GNBD    |       |    GNBD    |
    |   client   |       |   client   |       <-- these nodes use gfs
    |   node2    |       |   node3    |
    --------------       --------------
           |                    |
           ------------------------  IP network
                       |
                --------------
                |    GNBD    |
                |   server   |                <-- this node doesn't use gfs
                |   node1    |
                --------------
- There are three machines to use with hostnames: node1, node2, node3
- node1 has an extra disk /dev/sda1 to use for gfs
(this could be hda1 or an lvm LV or an md device)
- node1 will use gnbd to export this disk to node2 and node3
- Node1 cannot use gfs, it only acts as a gnbd server.
(Node1 will /not/ actually be part of the cluster since it is only
running the gnbd server.)
- Only node2 and node3 will be in the cluster and use gfs.
(A two-node cluster is a special case for cman, noted in the config below.)
- There's not much point to using clvm in this setup so it's left out.
- Download the "cluster" source tree.
- Build and install from the cluster source tree. (The kernel components
are not required on node1 which will only need the gnbd_serv program.)
cd cluster
./configure --kernel_src=/path/to/kernel
make; make install
- Create /etc/cluster/cluster.conf on node2 with the following contents:
<?xml version="1.0"?>
<cluster name="gamma" config_version="1">
<cman two_node="1" expected_votes="1">
</cman>
<clusternodes>
<clusternode name="node2">
<fence>
<method name="single">
<device name="gnbd" ipaddr="node2"/>
</method>
</fence>
</clusternode>
<clusternode name="node3">
<fence>
<method name="single">
<device name="gnbd" ipaddr="node3"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="gnbd" agent="fence_gnbd" servers="node1"/>
</fencedevices>
</cluster>
- load kernel modules on nodes
node2 and node3> modprobe gnbd
node2 and node3> modprobe gfs
node2 and node3> modprobe lock_dlm
- run the following commands
node1> gnbd_serv -n
node1> gnbd_export -c -d /dev/sda1 -e global_disk
node2 and node3> gnbd_import -i node1
node2 and node3> ccsd
node2 and node3> cman_tool join
node2 and node3> fence_tool join
node2> gfs_mkfs -p lock_dlm -t gamma:gfs1 -j 2 /dev/gnbd/global_disk
node2 and node3> mount -t gfs /dev/gnbd/global_disk /mnt
- the end, you now have a gfs file system mounted on node2 and node3
Appendix A
----------
To use manual fencing instead of gnbd fencing, the cluster.conf file
would look like this:
<?xml version="1.0"?>
<cluster name="gamma" config_version="1">
<cman two_node="1" expected_votes="1">
</cman>
<clusternodes>
<clusternode name="node2">
<fence>
<method name="single">
<device name="manual" ipaddr="node2"/>
</method>
</fence>
</clusternode>
<clusternode name="node3">
<fence>
<method name="single">
<device name="manual" ipaddr="node3"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="manual" agent="fence_manual"/>
</fencedevices>
</cluster>
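
In both configurations shown in the how-to, every <device name="..."> inside a node's <fence> block refers to a <fencedevice name="..."> entry further down. The sketch below is only an illustration of that cross-reference, written with Python's standard XML parser; it is not part of the cluster tools, and the function name undefined_fence_devices is made up for this example.

```python
import xml.etree.ElementTree as ET

# The manual-fencing configuration from Appendix A, copied as-is.
MANUAL_CONF = """<?xml version="1.0"?>
<cluster name="gamma" config_version="1">
<cman two_node="1" expected_votes="1">
</cman>
<clusternodes>
<clusternode name="node2">
<fence>
<method name="single">
<device name="manual" ipaddr="node2"/>
</method>
</fence>
</clusternode>
<clusternode name="node3">
<fence>
<method name="single">
<device name="manual" ipaddr="node3"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="manual" agent="fence_manual"/>
</fencedevices>
</cluster>
"""


def undefined_fence_devices(conf_text):
    """Return (node, device) pairs whose device has no matching <fencedevice>."""
    root = ET.fromstring(conf_text)
    defined = {fd.get("name") for fd in root.findall("./fencedevices/fencedevice")}
    missing = []
    for node in root.findall("./clusternodes/clusternode"):
        for dev in node.findall("./fence/method/device"):
            if dev.get("name") not in defined:
                missing.append((node.get("name"), dev.get("name")))
    return missing


# An empty list means every per-node device name is defined under <fencedevices>.
print(undefined_fence_devices(MANUAL_CONF))
```

The check looks only at names; it says nothing about whether the fence agent itself is installed or working.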
FAQ
---
- Why can't node1 use gfs, too?
You might be able to make it work, but we recommend that you not try.
This software was not intended or designed to allow that kind of usage.
- Isn't node1 a single point of failure? How do I avoid that?
Yes it is. For the time being, there's no way to avoid that, apart from
not using gnbd, of course. Eventually, there will be a way to avoid this
using cluster mirroring.
- More info from
http://sources.redhat.com/cluster/gnbd/gnbd_usage.txt
http://sources.redhat.com/cluster/doc/usage.txt
I executed the following commands on node1 (the GNBD server):
[root@localhost ~]# gnbd_serv -n
gnbd_serv: startup succeeded
[root@localhost ~]# gnbd_export -c -d /dev/sda5 -e global_disk
gnbd_export: created GNBD global_disk serving file /dev/sda5
[root@localhost ~]# gnbd_export -v
Server[1] : global_disk
--------------------------
file : /dev/sda5
sectors : 24820362
readonly : no
cached : yes
timeout : no
uid :
[root@localhost ~]# ps ax| grep gnbd
12571 ? S 0:00 gnbd_serv -n
12607 ? S 0:00 gnbd_serv -n
12609 pts/3 S+ 0:00 grep gnbd
[root@localhost ~]#
But I'm getting the following messages in /var/log/messages on node1 (the GNBD server machine):
Jul 18 14:34:06 localhost gnbd_serv[12571]: startup succeeded
Jul 18 14:37:35 localhost gnbd_serv[12571]: server process 12596 exited because of signal 15
Jul 18 14:37:40 localhost gnbd_serv[12571]: server process 12597 exited because of signal 15
Jul 18 14:37:45 localhost gnbd_serv[12571]: server process 12598 exited because of signal 15
Jul 18 14:37:50 localhost gnbd_serv[12571]: server process 12599 exited because of signal 15
Jul 18 14:37:55 localhost gnbd_serv[12571]: server process 12600 exited because of signal 15
Jul 18 14:38:00 localhost gnbd_serv[12571]: server process 12601 exited because of signal 15
Jul 18 14:38:05 localhost gnbd_serv[12571]: server process 12602 exited because of signal 15
Jul 18 14:38:10 localhost gnbd_serv[12571]: server process 12603 exited because of signal 15
Jul 18 14:38:15 localhost gnbd_serv[12571]: server process 12604 exited because of signal 15
Jul 18 14:38:20 localhost gnbd_serv[12571]: server process 12605 exited because of signal 15
Jul 18 14:38:25 localhost gnbd_serv[12571]: server process 12606 exited because of signal 15
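
To read the pattern in these server messages more easily, here is a small sketch that pulls the timestamp, child PID and signal number out of each line and prints the gap between consecutive exits along with the signal's symbolic name. The three sample lines are copied from the log above; the regular expression and the function names are assumptions made for this illustration and are not part of gnbd.

```python
import re
import signal

# Three lines copied from the server's /var/log/messages excerpt.
SERVER_LOG = """\
Jul 18 14:37:35 localhost gnbd_serv[12571]: server process 12596 exited because of signal 15
Jul 18 14:37:40 localhost gnbd_serv[12571]: server process 12597 exited because of signal 15
Jul 18 14:37:45 localhost gnbd_serv[12571]: server process 12598 exited because of signal 15
"""

EXIT_RE = re.compile(
    r"(\d\d):(\d\d):(\d\d) \S+ gnbd_serv\[\d+\]: "
    r"server process (\d+) exited because of signal (\d+)"
)


def child_exits(log_text):
    """Return (seconds_since_midnight, child_pid, signal_number) per matching line."""
    events = []
    for line in log_text.splitlines():
        match = EXIT_RE.search(line)
        if match:
            hours, minutes, seconds, pid, signum = map(int, match.groups())
            events.append((hours * 3600 + minutes * 60 + seconds, pid, signum))
    return events


def gaps(events):
    """Return the number of seconds between consecutive events."""
    times = [when for when, _pid, _sig in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]


events = child_exits(SERVER_LOG)
print(gaps(events))
print({signal.Signals(sig).name for _when, _pid, sig in events})
```

Signal 15 is SIGTERM, and each exit involves a new child PID, so the log shows a fresh server process being started and terminated on a regular cycle.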
I executed the following commands on node2 and node3:
[root@localhost ~]# modprobe gnbd
[root@localhost ~]# modprobe gfs
[root@localhost ~]# modprobe lock_dlm
[root@localhost ~]# gnbd_import -n -i 172.16.222.63
gnbd_import: created directory /dev/gnbd
gnbd_import: created gnbd device global_disk
gnbd_recvd: gnbd_recvd started
[root@localhost ~]# ccsd
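
For comparison, the how-to lists more client-side steps than this transcript contains. The sketch below simply diffs the two lists; the step list is taken from min-gfs.txt (gfs_mkfs is left out because it runs on node2 only), the command list is the transcript above, and the helper name missing_steps is hypothetical.

```python
# Client-side steps from min-gfs.txt, in the order the how-to gives them.
HOWTO_CLIENT_STEPS = [
    "modprobe gnbd",
    "modprobe gfs",
    "modprobe lock_dlm",
    "gnbd_import",
    "ccsd",
    "cman_tool join",
    "fence_tool join",
    "mount -t gfs",
]

# Commands shown in the transcript for node2 and node3.
COMMANDS_RUN = [
    "modprobe gnbd",
    "modprobe gfs",
    "modprobe lock_dlm",
    "gnbd_import -n -i 172.16.222.63",
    "ccsd",
]


def missing_steps(expected, actual):
    """Return the expected steps that no executed command starts with."""
    return [
        step
        for step in expected
        if not any(command.startswith(step) for command in actual)
    ]


print(missing_steps(HOWTO_CLIENT_STEPS, COMMANDS_RUN))
```

The comparison matches on command prefixes only, so it does not flag differences in options, such as the -n passed to gnbd_import here, which the how-to does not use.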
And these are the messages in /var/log/messages on node2 and node3 (the GNBD client machines):
Jul 18 09:09:19 localhost kernel: gnbd: registered device at major 252
Jul 18 09:09:21 localhost hald[2759]: Timed out waiting for hotplug event 318. Rebasing to 574
Jul 18 09:10:41 localhost kernel: CMAN <CVS> (built Jul 17 2006 09:01:33) installed
Jul 18 09:10:41 localhost kernel: NET: Registered protocol family 30
Jul 18 09:10:41 localhost kernel: Lock_Harness <CVS> (built Jul 17 2006 09:01:49) installed
Jul 18 09:10:41 localhost kernel: gfs: no version for "kcl_get_node_by_nodeid" found: kernel tainted.
Jul 18 09:10:41 localhost kernel: GFS <CVS> (built Jul 17 2006 09:02:14) installed
Jul 18 09:10:57 localhost kernel: DLM <CVS> (built Jul 17 2006 09:01:45) installed
Jul 18 09:10:57 localhost kernel: Lock_DLM (built Jul 17 2006 09:01:53) installed
Jul 18 09:15:03 localhost gnbd_recvd[6334]: gnbd_recvd started
Jul 18 09:15:03 localhost kernel: resending requests
Jul 18 09:15:41 localhost gnbd_recvd[6334]: client lost connection with 172.16.222.63 : Broken pipe
Jul 18 09:15:41 localhost gnbd_recvd[6334]: reconnecting
Jul 18 09:15:41 localhost kernel: gnbd0: Receive control failed (result -32)
Jul 18 09:15:41 localhost kernel: gnbd0: shutting down socket
Jul 18 09:15:41 localhost kernel: exitting GNBD_DO_IT ioctl
Jul 18 09:15:46 localhost kernel: resending requests
Jul 18 09:15:51 localhost gnbd_recvd[6334]: client lost connection with 172.16.222.63 : Broken pipe
Jul 18 09:15:51 localhost gnbd_recvd[6334]: reconnecting
Jul 18 09:15:51 localhost kernel: gnbd0: Receive control failed (result -32)
Jul 18 09:15:51 localhost kernel: gnbd0: shutting down socket
Jul 18 09:15:51 localhost kernel: exitting GNBD_DO_IT ioctl
Jul 18 09:15:56 localhost kernel: resending requests
Jul 18 09:15:58 localhost ccsd[6336]: Starting ccsd DEVEL.1153141288:
Jul 18 09:15:58 localhost ccsd[6336]: Built: Jul 17 2006 09:02:27
Jul 18 09:15:58 localhost ccsd[6336]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Jul 18 09:16:01 localhost gnbd_recvd[6334]: client lost connection with 172.16.222.63 : Broken pipe
Jul 18 09:16:01 localhost gnbd_recvd[6334]: reconnecting
Jul 18 09:16:01 localhost kernel: gnbd0: Receive control failed (result -32)
Jul 18 09:16:01 localhost kernel: gnbd0: shutting down socket
Jul 18 09:16:01 localhost kernel: exitting GNBD_DO_IT ioctl
Jul 18 09:16:06 localhost kernel: resending requests
Jul 18 09:16:11 localhost gnbd_recvd[6334]: client lost connection with 172.16.222.63 : Broken pipe
Jul 18 09:16:11 localhost gnbd_recvd[6334]: reconnecting
Jul 18 09:16:11 localhost kernel: gnbd0: Receive control failed (result -32)
Jul 18 09:16:11 localhost kernel: gnbd0: shutting down socket
Jul 18 09:16:11 localhost kernel: exitting GNBD_DO_IT ioctl
Jul 18 09:16:16 localhost kernel: resending requests
Jul 18 09:16:21 localhost gnbd_recvd[6334]: client lost connection with 172.16.222.63 : Broken pipe
Jul 18 09:16:21 localhost gnbd_recvd[6334]: reconnecting
Jul 18 09:16:21 localhost kernel: gnbd0: Receive control failed (result -32)
Jul 18 09:16:21 localhost kernel: gnbd0: shutting down socket
Jul 18 09:16:21 localhost kernel: exitting GNBD_DO_IT ioctl
Jul 18 09:16:26 localhost kernel: resending requests
Jul 18 09:16:27 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 30 seconds.
Jul 18 09:16:31 localhost gnbd_recvd[6334]: client lost connection with 172.16.222.63 : Broken pipe
Jul 18 09:16:31 localhost gnbd_recvd[6334]: reconnecting
Jul 18 09:16:31 localhost kernel: gnbd0: Receive control failed (result -32)
Jul 18 09:16:31 localhost kernel: gnbd0: shutting down socket
Jul 18 09:16:31 localhost kernel: exitting GNBD_DO_IT ioctl
Jul 18 09:16:57 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 60 seconds.
Jul 18 09:17:27 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 90 seconds.
Jul 18 09:17:57 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 120 seconds.
Jul 18 09:18:27 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 150 seconds.
Jul 18 09:18:57 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 180 seconds.
Jul 18 09:19:27 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 210 seconds.
Jul 18 09:19:57 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 240 seconds.
Jul 18 09:20:27 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 270 seconds.
Jul 18 09:20:57 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 300 seconds.
Jul 18 09:21:27 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 330 seconds.
Jul 18 09:21:57 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 360 seconds.
Jul 18 09:22:28 localhost ccsd[6336]: Unable to connect to cluster infrastructure after 390 seconds.
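
The kernel lines report "result -32" while gnbd_recvd reports "Broken pipe". Assuming the kernel message carries a negated errno value, which is the usual Linux convention but is not confirmed here from the gnbd source, the two are the same error. A minimal sketch that decodes the number, meant to be run on Linux since errno numbering is platform-specific (the function name decode_result is made up for this example):

```python
import errno
import os
import re

# One kernel line copied from the client's /var/log/messages excerpt.
KERNEL_LINE = (
    "Jul 18 09:15:41 localhost kernel: gnbd0: Receive control failed (result -32)"
)


def decode_result(line):
    """Return (errno_name, description) for a '(result -N)' log line, or None."""
    match = re.search(r"\(result (-?\d+)\)", line)
    if match is None:
        return None
    # Assumption: the logged value is a negated errno, so drop the sign.
    code = abs(int(match.group(1)))
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


print(decode_result(KERNEL_LINE))
```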
My /etc/cluster/cluster.conf file looks like this:
<?xml version="1.0"?>
<cluster name="gamma" config_version="1">
<cman two_node="1" expected_votes="1">
</cman>
<clusternodes>
<clusternode name="172.16.222.128">
<fence>
<method name="single">
<device name="gnbd" ipaddr="172.16.222.128"/>
</method>
</fence>
</clusternode>
<clusternode name="172.16.222.62">
<fence>
<method name="single">
<device name="gnbd" ipaddr="172.16.222.62"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="gnbd" agent="fence_gnbd" servers="172.16.222.63"/>
</fencedevices>
</cluster>
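
When a config file is pasted through email, long lines can get wrapped, and a line break that lands inside an attribute value such as ipaddr="..." becomes part of that value. The sketch below is one way to spot this in the actual file on disk; it is an illustration only, the sample is a shortened excerpt with one deliberately wrapped value, and the function name padded_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt; the ipaddr value is wrapped on purpose for the demonstration.
SAMPLE_CONF = """<?xml version="1.0"?>
<cluster name="gamma" config_version="1">
<clusternodes>
<clusternode name="172.16.222.128">
<fence>
<method name="single">
<device name="gnbd" ipaddr="
172.16.222.128"/>
</method>
</fence>
</clusternode>
</clusternodes>
</cluster>
"""


def padded_attributes(conf_text):
    """Return (tag, attribute, stripped_value) for values with stray whitespace."""
    root = ET.fromstring(conf_text)
    found = []
    for elem in root.iter():
        for key, value in elem.attrib.items():
            # A wrapped value keeps leading or trailing whitespace after parsing.
            if value != value.strip():
                found.append((elem.tag, key, value.strip()))
    return found


print(padded_attributes(SAMPLE_CONF))
```

To check a real file, read /etc/cluster/cluster.conf into a string and pass it to the function; an empty list means no attribute value has leading or trailing whitespace.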
In the end, I'm not able to use GFS. If anybody has any idea, please help me.
With regards,
Rajesh