With the RDM created and all the daemons started (luci, ricci, cman), I can now configure GFS. Make sure they're running on all of the nodes.
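A quick sanity check that both nodes have actually joined the cluster (output not pasted here) is cman_tool, something like:
[root@test03]# cman_tool status
[root@test03]# cman_tool nodes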
We can even see the RDM on the guest systems:
[root@test03]# ls /dev/sdb
/dev/sdb
[root@test04]# ls /dev/sdb
/dev/sdb
So we are doing this using LVM clustering, following these two writeups: http://emrahbaysal.blogspot.com/2011/03/gfs-cluster-on-vmware-vsphere-rh...
and http://linuxdynasty.org/215/howto-setup-gfs2-with-clustering/
We've already set up the GFS daemons, fencing, and whatnot.
Before we start creating the LVM2 volumes and proceed to GFS2, we need to enable clustering in LVM2.
[root@test03]# lvmconf --enable-cluster
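As far as I can tell, all that does is switch /etc/lvm/lvm.conf over to cluster locking (the non-comment locking_type line should end up as locking_type = 3), and it needs to be done on every node. A quick check:
[root@test03]# grep locking_type /etc/lvm/lvm.conf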
I try to create the cluster FS
[root@test03]# pvcreate /dev/sdb
connect() failed on local socket: No such file or directory
Internal cluster locking initialisation failed.
WARNING: Falling back to local file-based locking.
Volume Groups with the clustered attribute will be inaccessible.
Physical volume "/dev/sdb" successfully created
One internet source says:
>> That indicates that you have cluster locking enabled but that the cluster LVM
>> daemon (clvmd) is not running.
So let's start it,
[root@test03]# service clvmd status
clvmd is stopped
[root@test03]# service clvmd start
Starting clvmd:
Activating VG(s): 2 logical volume(s) in volume group "VolGroup00" now active
clvmd not running on node test04
[ OK ]
[root@test03]# chkconfig clvmd on
Okay, over on the other node:
[root@test04]# service clvmd status
clvmd is stopped
[root@test04]# service clvmd start
Starting clvmd: clvmd could not connect to cluster manager
Consult syslog for more information
[root@test04]# service cman status
groupd is stopped
[root@test04]# service cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing... done
[ OK ]
[root@test04]# chkconfig cman on
[root@test04]# service luci status
luci is running...
[root@test04]# service ricci status
ricci (pid 4381) is running...
[root@test04]# chkconfig ricci on
[root@test04]# chkconfig luci on
[root@test04]# service clvmd start
Starting clvmd:
Activating VG(s): 2 logical volume(s) in volume group "VolGroup00" now active
[ OK ]
And this time, no complaints:
[root@test03]# service clvmd restart
Restarting clvmd: [ OK ]
Try again with pvcreate:
[root@test03]# pvcreate /dev/sdb
Physical volume "/dev/sdb" successfully created
Create volume group:
[root@test03]# vgcreate gdcache_vg /dev/sdb
Clustered volume group "gdcache_vg" successfully created
Create logical volume:
[root@test03]# lvcreate -n gdcache_lv -L 2T gdcache_vg
Logical volume "gdcache_lv" created
Create GFS filesystem, ahem, GFS2 filesystem. I screwed
this up the first time.
[root@test03]# mkfs.gfs2 -j 8 -p lock_dlm -t gdcluster:gdcache -j 4 /dev/mapper/gdcache_vg-gdcache_lv
This will destroy any data on /dev/mapper/gdcache_vg-gdcache_lv.
It appears to contain a gfs filesystem.
Are you sure you want to proceed? [y/n] y
Device: /dev/mapper/gdcache_vg-gdcache_lv
Blocksize: 4096
Device Size 2048.00 GB (536870912 blocks)
Filesystem Size: 2048.00 GB (536870910 blocks)
Journals: 4
Resource Groups: 8192
Locking Protocol: "lock_dlm"
Lock Table: "gdcluster:gdcache"
UUID: 0542628C-D8B8-2480-F67D-081435F38606
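(Note to self: I passed -j twice in that command; the later -j 4 is what took effect, as the Journals line shows. The general shape is:
mkfs.gfs2 -p lock_dlm -t <clustername>:<fsname> -j <journals> <block-device>
where <clustername> has to match the name in /etc/cluster/cluster.conf and <journals> should be at least the number of nodes that will mount the filesystem. The cluster-name requirement is exactly what bites me next.)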
Okay! And! Finally! We mount it!
[root@test03]# mount /dev/mapper/gdcache_vg-gdcache_lv /data
/sbin/mount.gfs: fs is for a different cluster
/sbin/mount.gfs: error mounting lockproto lock_dlm
Wawawwah. Bummer.
/var/log/messages says:
Jan 19 14:21:05 test03 gfs_controld[3369]: mount: fs requires cluster="gdcluster" current="gdao_cluster"
Someone on the interwebs concurs:
>> the cluster name defined in /etc/cluster/cluster.conf is
>> different from the one tagged on the GFS volume.
Okay, so looking at cluster.conf:
[root@test03]# vi /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="25" name="gdao_cluster">
Let's change that to match the name I gave the cluster in the mkfs.gfs2 above:
[root@test03]# vi /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="25" name="gdcluster">
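Two notes on this. First, when hand-editing cluster.conf you normally need to bump config_version and make sure the other node ends up with the same file, so both copies agree. Second, the other way around the mismatch would have been to leave the cluster name alone and re-tag the filesystem's lock table instead (with the filesystem unmounted), something like:
[root@test03]# gfs2_tool sb /dev/mapper/gdcache_vg-gdcache_lv table gdao_cluster:gdcache
I haven't tested that here, though.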
And restart some stuff:
[root@test03]# /etc/init.d/gfs2 stop
[root@test03]# service luci stop
Shutting down luci: [ OK ]
[root@test03]# service ricci stop
Shutting down ricci: [ OK ]
[root@test03]# service cman stop
Stopping cluster:
Stopping fencing... done
Stopping cman... failed
/usr/sbin/cman_tool: Error leaving cluster: Device or resource busy
[FAILED]
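The "Device or resource busy" is presumably clvmd and/or the gfs2 mount still holding DLM lockspaces, which keeps cman from leaving cleanly. Before resorting to force, something like this would show which subsystems are still registered with the cluster:
[root@test03]# cman_tool services
But force it is: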
[root@test03]# cman_tool leave force
[root@test03]# service cman stop
Stopping cluster:
Stopping fencing... done
Stopping cman... done
Stopping ccsd... done
Unmounting configfs... done
[ OK ]
AAAARRRRGGGHGHHH
[root@test03]# service ricci start
Starting ricci: [ OK ]
[root@test03]# service luci start
Starting luci: [ OK ]
Point your web browser to https://test03.gdao.ucsc.edu:8084 to access luci
[root@test03]# service gfs2 start
[root@test03]# service cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing... failed
[FAILED]
I had to reboot.
[root@test03]# service luci status
luci is running...
[root@test03]# service ricci status
ricci (pid 4385) is running...
[root@test03]# service cman status
cman is running.
[root@test03]# service gfs2 status
Okay, again?
[root@test03]# mount /dev/mapper/gdcache_vg-gdcache_lv /data
Did that just work? And on test04:
[root@test04]# mount /dev/mapper/gdcache_vg-gdcache_lv /data
Okay, how about a test:
[root@test03]# touch /data/killme
And then we look on the other node:
[root@test04]# ls /data
killme
Holy shit.
I've been working so hard for this moment that I don't
completely know what to do now.
Question is, now that I have two working nodes, can I
duplicate it?
Okay, finish up:
[root@test03]# chkconfig rgmanager on
[root@test03]# service rgmanager start
Starting Cluster Service Manager: [ OK ]
[root@test03]# vi /etc/fstab
/dev/mapper/gdcache_vg-gdcache_lv /data gfs2 defaults,noatime,nodiratime 0 0
and on the other node:
[root@test04]# chkconfig rgmanager on
[root@test04]# service rgmanager start
Starting Cluster Service Manager:
[root@test04]# vi /etc/fstab
/dev/mapper/gdcache_vg-gdcache_lv /data gfs2 defaults,noatime,nodiratime 0 0
And it works. Hell, yeah.
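One last note: with a GFS2 entry in fstab, I believe it's the gfs2 init script that actually mounts it at boot (after cman and clvmd come up), so it should be enabled on both nodes too:
[root@test03]# chkconfig gfs2 on
[root@test04]# chkconfig gfs2 on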