Hi,

this is from an FC4/x86_64 node that forms a cluster with three RHEL4/x86_64 nodes. All of them are running the latest errata kernels and (vendor-packaged) cluster/gfs bits.

a) To start with: is it OK to mix FC4 and RHEL4, or did I do something forbidden?

b) The cluster wasn't doing anything with the GFS filesystem at that time, i.e. it was just mounted on all 4 nodes; no data was being moved in any direction.

c) The other nodes correctly replayed the journal. This node was removed from the cluster w/o fencing and w/o any traces in the logs other than gfs' "about to withdraw from the cluster". I expected cman to report this, too. The other nodes' logs only contained information about the journal acquisition and replay.

d) There is a 10 min. delay between the moment of the mysterious filesystem consistency error and a series of Glock messages.

e) And most importantly: why did gfs issue a filesystem consistency error upon a simple umount? FC4 vs RHEL4 issue?

Thanks!

Aug 11 19:11:48 zs01 rgmanager: [25900]: <notice> Shutting down Cluster Service Manager...
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Shutting down
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Stopping service homes-cifs
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Stopping service backup
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Service homes-cifs is stopped
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Service backup is stopped
Aug 11 19:11:52 zs01 clurgmgrd[3660]: <notice> Shutdown complete, exiting
Aug 11 19:11:53 zs01 rgmanager: [25900]: <notice> Cluster Service Manager is stopped.
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: fatal: filesystem consistency error
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: function = trans_go_xmote_bh
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: file = /usr/src/build/588747-x86_64/BUILD/smp/src/gfs/glops.c, line = 542
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: time = 1123780394
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: about to withdraw from the cluster
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: waiting for outstanding I/O
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: telling LM to withdraw
Aug 11 19:13:27 zs01 kernel: lock_dlm: withdraw abandoned memory
Aug 11 19:13:27 zs01 kernel: GFS: fsid=physik:data.2: withdrawn
Aug 11 19:23:27 zs01 kernel: ror = 0
Aug 11 19:23:27 zs01 kernel: gh_iflags = 2 4 5
Aug 11 19:23:27 zs01 kernel: Glock (5, 8676146)
Aug 11 19:23:27 zs01 kernel: gl_flags = 1
Aug 11 19:23:27 zs01 kernel: gl_count = 3
Aug 11 19:23:27 zs01 kernel: gl_state = 3
Aug 11 19:23:27 zs01 kernel: req_gh = yes
Aug 11 19:23:27 zs01 kernel: req_bh = yes
Aug 11 19:23:27 zs01 kernel: lvb_count = 0
Aug 11 19:23:27 zs01 kernel: object = no
Aug 11 19:23:27 zs01 kernel: new_le = no
Aug 11 19:23:27 zs01 kernel: incore_le = no
Aug 11 19:23:27 zs01 kernel: reclaim = no
Aug 11 19:23:27 zs01 kernel: aspace = no
Aug 11 19:23:27 zs01 kernel: ail_bufs = no
Aug 11 19:23:27 zs01 kernel: Request
Aug 11 19:23:27 zs01 kernel: owner = -1
Aug 11 19:23:27 zs01 kernel: gh_state = 0
Aug 11 19:23:27 zs01 kernel: gh_flags = 0
Aug 11 19:23:27 zs01 kernel: error = 0
Aug 11 19:23:27 zs01 kernel: gh_iflags = 2 4 5
Aug 11 19:23:27 zs01 kernel: Waiter2
Aug 11 19:23:27 zs01 kernel: owner = -1
Aug 11 19:23:27 zs01 kernel: gh_state = 0
Aug 11 19:23:27 zs01 kernel: gh_flags = 0
Aug 11 19:23:27 zs01 kernel: error = 0
Aug 11 19:23:27 zs01 kernel: gh_iflags = 2 4 5
Aug 11 19:23:27 zs01 kernel: Glock (5, 7146196)
Aug 11 19:23:27 zs01 kernel: gl_flags = 1
Aug 11 19:23:27 zs01 kernel: gl_count = 3
Aug 11 19:23:27 zs01 kernel: gl_state = 3
Aug 11 19:23:27 zs01 kernel: req_gh = yes
Aug 11 19:23:27 zs01 kernel: req_bh = yes
Aug 11 19:23:27 zs01 kernel: lvb_count = 0
Aug 11 19:23:27 zs01 kernel: object = no
Aug 11 19:23:27 zs01 kernel: new_le = no
Aug 11 19:23:27 zs01 kernel: incore_le = no
Aug 11 19:23:27 zs01 kernel: reclaim = no
Aug 11 19:23:27 zs01 kernel: aspace = no
Aug 11 19:23:27 zs01 kernel: ail_bufs = no
Aug 11 19:23:27 zs01 kernel: Request
Aug 11 19:23:27 zs01 kernel: owner = -1
Aug 11 19:23:27 zs01 kernel: gh_state = 0
Aug 11 19:23:27 zs01 kernel: gh_flags = 0
Aug 11 19:23:27 zs01 kernel: error = 0
Aug 11 19:23:27 zs01 kernel: gh_iflags = 2 4 5
Aug 11 19:23:27 zs01 kernel: Waiter2
Aug 11 19:23:27 zs01 kernel: owner = -1
Aug 11 19:23:27 zs01 kernel: gh_state = 0
Aug 11 19:23:27 zs01 kernel: gh_flags = 0
Aug 11 19:23:27 zs01 kernel: error = 0
Aug 11 19:23:27 zs01 kernel: gh_iflags = 2 4 5
Aug 11 19:23:27 zs01 kernel: Glock (5, 190905665)
[...]

-- 
Axel.Thimm at ATrpms.net
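For readers checking the timings in the log: the sketch below is not part of the original report. It decodes the `time = 1123780394` field of the GFS error and measures the gap between the "withdrawn" line and the first Glock dump line. The year 2005 is not in the syslog stamps; it is inferred from the epoch value. The helper name `parse_syslog` is made up for this example, and `%b` parsing assumes an English/C locale.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the log excerpt.
error_epoch = 1123780394           # "time = ..." in the GFS consistency error
withdrawn_at = "Aug 11 19:13:27"   # "GFS: fsid=physik:data.2: withdrawn"
glock_dump_at = "Aug 11 19:23:27"  # first line of the Glock dump


def parse_syslog(stamp, year=2005):
    """Parse a syslog timestamp; the year is an assumption (syslog omits it)."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


# Epoch seconds interpreted as UTC.
error_utc = datetime.fromtimestamp(error_epoch, tz=timezone.utc)

# Gap between the completed withdraw and the start of the Glock dump.
delay = parse_syslog(glock_dump_at) - parse_syslog(withdrawn_at)

print(error_utc.isoformat())  # 2005-08-11T17:13:14+00:00
print(delay)                  # 0:10:00
```

The decoded time is 17:13:14 UTC, which lines up with the 19:13:14 syslog stamp if the node's clock runs at UTC+2 (an inference, the log itself does not say). The gap between "withdrawn" and the Glock dump is exactly 10 minutes.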
-- Linux-cluster@xxxxxxxxxx http://www.redhat.com/mailman/listinfo/linux-cluster