Now that FC4 has been released (finally), I thought I'd post a note with
the challenges I met when upgrading my 1-node cluster from FC3.
I did a full reinstall of the operating system.
I tried to install the cluster packages using yum (from both the base
and the develop repositories), but found that old bugs already fixed
in CVS were still present in those packages.
In the end I kept device-mapper from 'base', but reinstalled both
'cluster' and 'LVM2' from source. I rebuilt LVM2 simply because adding
the lvm2-cluster package wanted to drag in ccs and some other packages
as RPMs, which I didn't want.
To fetch sources:
cvs -d :pserver:cvs@xxxxxxxxxxxxxxxxxx:/cvs/cluster login cvs
cvs -d :pserver:cvs@xxxxxxxxxxxxxxxxxx:/cvs/cluster checkout -r FC4 cluster
cvs -d :pserver:cvs@xxxxxxxxxxxxxxxxxx:/cvs/lvm2 checkout LVM2
To build, I ran:
cd cluster
./configure --kernel_src=/lib/modules/`uname -r`/build/
make
make install
cd ../LVM2
./configure --with-clvmd --with-cluster=shared
make
make install
But it wasn't really that simple. :-(
Compilation of cluster failed in 2 places.
I removed -Werror from the gcc options in magma/lib/Makefile so that
the warning there no longer aborts the build.
I also removed the 'static int' declaration of loglevel in
rgmanager/src/clulib/clulog.c, leaving only the initialisation of the
variable.
With these 2 changes I got the thing installed and started.
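The -Werror change can be scripted if you rebuild often. This is only a
minimal sketch: it assumes GNU sed and that the flag appears literally
as -Werror in the Makefile, and the strip_werror name is made up for
illustration.

```shell
# Hypothetical helper: delete every literal "-Werror" from the given file.
# Assumes GNU sed (for -i) and that no "-Werror=..." variants are present.
strip_werror() {
    sed -i 's/-Werror//g' "$1"
}
```

From the top of the cluster checkout it would be called as
strip_werror magma/lib/Makefile before running make.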
My next problem was that I got a lot of these messages:
clurgmgrd[11283]: <notice> status on nfsclient "XXX" returned 1 (generic error)
I know from previous experience that this can cause I/O errors on the
clients when rgmanager decides to unexport and re-export the file
system, so it has to be fixed.
My temporary fix is to edit /usr/share/cluster/nfsclient.sh and replace
this line
exportfs | grep -q "^${OCF_RESKEY_path}\ .*${OCF_RESKEY_target}"
with the line
grep -q "^${OCF_RESKEY_path}[ ]*${OCF_RESKEY_target}" /var/lib/nfs/etab
IMPORTANT NOTE: The characters between [] are a space and a tab.
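Since a literal space and tab are easy to lose when pasting, the same
check can probably be written with a POSIX character class instead.
This is a sketch, not what the shipped nfsclient.sh does; the
is_exported function and its optional third argument are assumptions
added for illustration and testing.

```shell
# Hypothetical helper mirroring the grep line above.
# $1 = export path, $2 = client name, $3 = etab file (defaults to the real one).
# As in the original line, path and client are interpreted as regular expressions.
is_exported() {
    grep -q "^${1}[[:space:]]*${2}" "${3:-/var/lib/nfs/etab}"
}
```

Inside the script it would be called as
is_exported "$OCF_RESKEY_path" "$OCF_RESKEY_target".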
For some strange reason exportfs doesn't list exported file systems...
I have one file system defined in /etc/exports, just to get nfs up at
boot. Just 'exportfs' doesn't list anything. 'exportfs|cat' sometimes
lists this one export, sometimes nothing. 'cat /var/lib/nfs/etab' shows
all exports, including those added by rgmanager.
Another note is that hosts may not show up in etab with the same name
you used in the nfsclient line in the config. I guess it uses the result
from a reverse lookup in DNS? Check messages, and if you get errors,
find the name used in etab and change your nfsclient line to use the
same name.
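To see which names etab actually holds for a given path, something like
the following may help. It is a sketch that assumes each etab line has
the form path, whitespace, host(options); the etab_hosts name and the
optional file argument are made up for illustration.

```shell
# Hypothetical helper: print the client names recorded in etab for one path.
# $1 = export path, $2 = etab file (defaults to the real one).
etab_hosts() {
    # Keep lines whose first field is the path, strip "(options)" from the
    # second field, and print the remaining host name.
    awk -v p="$1" '$1 == p { sub(/\(.*/, "", $2); print $2 }' \
        "${2:-/var/lib/nfs/etab}"
}
```

Running etab_hosts with the path from your nfsclient line should list
the names to compare against the config.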
The only other issues I have run into (not cluster related) are:
The TSM backup client README states that I need
compat-gcc-c++-7.3-2.96.122.i386.rpm to install the TSM RPMs. I found
that on FC4 I needed compat-libstdc++-33 instead. Not tested yet, just
installed.
I also have a problem getting the samba server to join the AD domain.
It creates the host in AD, but then seems to hang forever. Perhaps it
is time I double-checked that I carried over all the iptables openings
from FC3 (where it used to work).
--
birger
--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster