Are you using GFS or GFS2? GFS2 doesn't work yet, at least not in the
released RPMs.
Gordan
On Tue, 9 Oct 2007, James Fillman wrote:
Ok. I'm trying to implement GFS on two different clusters: one with 9
nodes and one with 17 nodes.
I'm having nothing but trouble. The GFS volumes are freezing and
throwing the cluster into a bad state. Currently, this is the state of
my cluster:
[root@plxp01md-new log]# cman_tool services
type             level name     id       state
fence            0     default  00010004 none
[1 2 3 4 5 6 7 8 9]
dlm              1     clvmd    00010003 none
[1 2 3 4 5 6 7 8 9]
dlm              1     mdi_log  00020001 FAIL_START_WAIT
[1 2 3 4 6 7 8 9]
dlm              1     deploy   00040001 FAIL_START_WAIT
[1 4 6 7 8 9]
gfs              2     mdi_log  00010001 FAIL_START_WAIT
[1 2 3 4 6 7 8 9]
gfs              2     deploy   00030001 FAIL_START_WAIT
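For anyone comparing this kind of output across many nodes, here is a rough sketch of a script that picks out the groups whose state is not "none" and lists which node IDs are absent from each one. It only parses the text format shown above; the helper names (parse_services, stuck_groups) are made up for illustration and are not part of cman. A node absent from a group has not necessarily failed, since it may simply never have mounted that filesystem.

```python
# Sample text copied from the listing above. On a live node it would be the
# output of `cman_tool services` instead.
SAMPLE = """\
type level name id state
fence 0 default 00010004 none
[1 2 3 4 5 6 7 8 9]
dlm 1 clvmd 00010003 none
[1 2 3 4 5 6 7 8 9]
dlm 1 mdi_log 00020001 FAIL_START_WAIT
[1 2 3 4 6 7 8 9]
dlm 1 deploy 00040001 FAIL_START_WAIT
[1 4 6 7 8 9]
gfs 2 mdi_log 00010001 FAIL_START_WAIT
[1 2 3 4 6 7 8 9]
gfs 2 deploy 00030001 FAIL_START_WAIT
"""


def parse_services(text):
    """Parse the listing into one dict per group (hypothetical helper)."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("type "):
            continue  # skip blank lines and the header row
        if line.startswith("["):
            # A member list belongs to the group line just before it.
            if groups:
                groups[-1]["members"] = [int(n) for n in line.strip("[]").split()]
            continue
        fields = line.split()
        if len(fields) == 5:
            gtype, level, name, gid, state = fields
            groups.append({"type": gtype, "level": int(level), "name": name,
                           "id": gid, "state": state, "members": None})
    return groups


def stuck_groups(groups):
    """Return (type, name, state, absent_nodes) for groups not in state 'none'.

    absent_nodes is relative to every node ID seen anywhere in the listing,
    and is None when the listing gave no member list for that group.
    """
    all_nodes = set()
    for g in groups:
        if g["members"]:
            all_nodes.update(g["members"])
    report = []
    for g in groups:
        if g["state"] == "none":
            continue
        absent = None
        if g["members"] is not None:
            absent = sorted(all_nodes - set(g["members"]))
        report.append((g["type"], g["name"], g["state"], absent))
    return report


if __name__ == "__main__":
    for gtype, name, state, absent in stuck_groups(parse_services(SAMPLE)):
        print(gtype, name, state, "absent nodes:", absent)
```

Run against the listing above, this flags the two dlm lockspaces and the two gfs mount groups, with node 5 absent from every stuck group that has a member list.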
I have no idea what happened. I've got users who are writing to a GFS
volume and who just reported to me that the volumes are not responding.
/var/log/messages has been outputting the following message, about 50
times a second, since Friday:
Oct 9 13:54:35 plxp01deploy kernel: dlm: recover_master_copy -53 401ce
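To put a number on how often that message appears, a small sketch like the following could count matching lines per second. It assumes the usual syslog layout seen in the line above (month, day, time as the first three fields); the function name per_second_counts is made up for illustration.

```python
from collections import Counter


def per_second_counts(lines, needle="dlm: recover_master_copy"):
    """Count lines containing `needle`, keyed by (month, day, HH:MM:SS)."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        parts = line.split()
        if len(parts) < 3:
            continue  # not a normal syslog line, skip it
        counts[(parts[0], parts[1], parts[2])] += 1
    return counts


if __name__ == "__main__":
    # /var/log/messages is the path mentioned in this thread.
    with open("/var/log/messages", errors="replace") as fh:
        counts = per_second_counts(fh)
    for stamp, n in counts.most_common(5):
        print(" ".join(stamp), n)
```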
Can someone tell me what FAIL_START_WAIT means and how to recover from
it? Also, does anyone know what the log message above means?
All my servers in the cluster are showing the same service states.
I'm running RHEL 5, 64-bit.
Please help. I'm almost ready to give up on GFS. It seems way too
unstable.
James Fillman
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster