GFS assertion failure

"Ben Yarwood" <ben.yarwood@xxxxxxxxxx> · Mon, 21 Jul 2008 10:29:07 +0100

I have a three-node cluster running the latest 4.6 code with 14 gfs file systems mounted.  One of them is a three-month-old, heavily
used gfs file system that has never had any problems.  There have been no shared storage power outages or anything else I can think
of that could have caused a problem in the fs, yet I got the following error and a withdraw:

Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: fatal: assertion "FALSE" failed
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   function = xmote_bh
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/glock.c, line = 1093
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   time = 1216415126
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: about to withdraw from the cluster
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: waiting for outstanding I/O
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: telling LM to withdraw
Jul 18 22:05:27 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: withdrawn
Jul 18 22:05:27 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: ret = 0x00000002
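
For anyone collecting these reports, here is a rough sketch of how the assertion details in the log above could be pulled into one
summary for a bug report.  The names summarize, GFS_LINE and KEY_VALUE are made up for illustration, and the patterns only assume
the line layout shown in this one excerpt; other GFS messages may well be laid out differently.

```python
import re
from datetime import datetime, timezone

# Lines copied from the log above, with the file/line entry on a single line.
LOG = """\
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: fatal: assertion "FALSE" failed
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   function = xmote_bh
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/glock.c, line = 1093
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   time = 1216415126
"""

# Hypothetical patterns, based only on the layout of the excerpt above.
GFS_LINE = re.compile(r"GFS: fsid=(?P<fsid>\S+): +(?P<msg>.*)$")
KEY_VALUE = re.compile(r"(\w+) = ([^,]+)")
ASSERTION = re.compile(r'assertion "(.*)" failed')


def summarize(log_text):
    """Collect fsid, assertion text and the key = value fields into a dict."""
    info = {}
    for line in log_text.splitlines():
        match = GFS_LINE.search(line)
        if not match:
            continue  # not a GFS kernel message
        info["fsid"] = match.group("fsid")
        msg = match.group("msg")
        failed = ASSERTION.search(msg)
        if failed:
            info["assertion"] = failed.group(1)
        for key, value in KEY_VALUE.findall(msg):
            info[key] = value.strip()
    if "time" in info:
        # The "time" field looks like seconds since the Unix epoch.
        stamp = datetime.fromtimestamp(int(info["time"]), tz=timezone.utc)
        info["utc"] = stamp.isoformat()
    return info


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value}")
```

Treating the time field as seconds since the epoch gives 21:05:26 UTC on 18 July 2008, which lines up with the 22:05:26 syslog
stamp if the node's clock is on British Summer Time (an assumption on my part).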

Unfortunately, the file system wouldn't unmount after this, and the only way to get the node up and running again was to fence it.
I checked Bugzilla and can't find anything still open relating to this.

Can anyone:

1.  Suggest a good strategy for trying to get the fs unmounted so that a fence is not required and a normal reboot can be done?
2.  Suggest what information I should have captured to better help debugging in the future?  I think this would make a good FAQ
entry and be helpful to all.

Finally, the FAQ says that after a gfs withdraw the node should be rebooted before remounting.  Is this correct, and is it
related to replaying journals?  What would happen if you didn't reboot?

Cheers
Ben

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster