I have a problem with a GFS2 filesystem that is (was) mounted from a
single host. The system appeared to have hung over the weekend, so I
unmounted and remounted the disk. After a couple of minutes I received
this in the kernel logs:
Mar 15 08:28:50 localhost kernel: GFS2: fsid=: Trying to join cluster "lock_nolock", "sde1"
Mar 15 08:28:50 localhost kernel: GFS2: fsid=sde1.0: Now mounting FS...
Mar 15 08:28:50 localhost kernel: GFS2: fsid=sde1.0: jid=0, already locked for use
Mar 15 08:28:50 localhost kernel: GFS2: fsid=sde1.0: jid=0: Looking at journal...
Mar 15 08:28:50 localhost kernel: GFS2: fsid=sde1.0: jid=0: Done
Mar 15 08:43:37 localhost kernel: GFS2: fsid=sde1.0: fatal: invalid metadata block
Mar 15 08:43:37 localhost kernel: GFS2: fsid=sde1.0: bh = 4294972166 (type: exp=3, found=2)
Mar 15 08:43:37 localhost kernel: GFS2: fsid=sde1.0: function = gfs2_rgrp_bh_get, file = fs/gfs2/rgrp.c, line = 759
Mar 15 08:43:37 localhost kernel: GFS2: fsid=sde1.0: about to withdraw this file system
Mar 15 08:43:37 localhost kernel: GFS2: fsid=sde1.0: withdrawn
Mar 15 08:43:37 localhost kernel: Pid: 3687, comm: cp Not tainted 2.6.32-gentoo-r7 #2
Mar 15 08:43:37 localhost kernel: Call Trace:
Mar 15 08:43:37 localhost kernel: [<ffffffffa03b285d>] ? gfs2_lm_withdraw+0x12d/0x160 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffff813bf22b>] ? io_schedule+0x4b/0x70
Mar 15 08:43:37 localhost kernel: [<ffffffff810cc560>] ? sync_buffer+0x0/0x50
Mar 15 08:43:37 localhost kernel: [<ffffffff813bf7a9>] ? out_of_line_wait_on_bit+0x79/0xa0
Mar 15 08:43:37 localhost kernel: [<ffffffff8104e740>] ? wake_bit_function+0x0/0x30
Mar 15 08:43:37 localhost kernel: [<ffffffff810cb162>] ? submit_bh+0x112/0x140
Mar 15 08:43:37 localhost kernel: [<ffffffffa03b2947>] ? gfs2_metatype_check_ii+0x47/0x60 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa03ae40b>] ? gfs2_rgrp_bh_get+0x1db/0x300 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa0397d86>] ? do_promote+0x116/0x200 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa03992a5>] ? finish_xmote+0x1a5/0x3a0 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa0398fcd>] ? do_xmote+0xfd/0x230 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa039986d>] ? gfs2_glock_nq+0x13d/0x320 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa03aea2d>] ? gfs2_inplace_reserve_i+0x1ed/0x7b0 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa0399581>] ? run_queue+0xe1/0x210 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa039986d>] ? gfs2_glock_nq+0x13d/0x320 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffffa03a1f92>] ? gfs2_write_begin+0x272/0x480 [gfs2]
Mar 15 08:43:37 localhost kernel: [<ffffffff8106df04>] ? generic_file_buffered_write+0x114/0x290
Mar 15 08:43:37 localhost kernel: [<ffffffff8106e4a8>] ? __generic_file_aio_write+0x278/0x450
Mar 15 08:43:37 localhost kernel: [<ffffffff8106e6d5>] ? generic_file_aio_write+0x55/0xb0
Mar 15 08:43:37 localhost kernel: [<ffffffff810a6a1b>] ? do_sync_write+0xdb/0x120
Mar 15 08:43:37 localhost kernel: [<ffffffff8104e710>] ? autoremove_wake_function+0x0/0x30
Mar 15 08:43:37 localhost kernel: [<ffffffff8108511f>] ? handle_mm_fault+0x1bf/0x850
Mar 15 08:43:37 localhost kernel: [<ffffffff8108b5cc>] ? mmap_region+0x23c/0x5d0
Mar 15 08:43:37 localhost kernel: [<ffffffff810a752b>] ? vfs_write+0xcb/0x160
Mar 15 08:43:37 localhost kernel: [<ffffffff810a76c3>] ? sys_write+0x53/0xa0
Mar 15 08:43:37 localhost kernel: [<ffffffff8100b2ab>] ? system_call_fastpath+0x16/0x1b
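
For reference, the "bh = ... (type: exp=..., found=...)" line above can be
decoded with a few lines of Python. This is only a sketch under stated
assumptions: the type names are taken from the GFS2_METATYPE_* constants in
the kernel's gfs2_ondisk.h, and the 4096-byte block size is an assumed
default that is not confirmed anywhere in this report. The helper name
decode_withdraw is made up for illustration.

```python
import re

# GFS2_METATYPE_* values as found in include/linux/gfs2_ondisk.h
# (assumed to match the 2.6.32 kernel in this report).
GFS2_METATYPES = {
    0: "NONE",
    1: "SB (superblock)",
    2: "RG (resource group header)",
    3: "RB (resource group bitmap)",
    4: "DI (dinode)",
    5: "IN (indirect block)",
    6: "LF (directory leaf)",
    7: "JD (journaled data)",
    8: "LH (log header)",
    9: "LD (log descriptor)",
    10: "EA (extended attribute)",
    11: "ED (extended attribute data)",
    12: "LB (log buffer)",
    14: "QC (quota change)",
}

PATTERN = re.compile(r"bh = (\d+) \(type: exp=(\d+), found=(\d+)\)")


def decode_withdraw(line, block_size=4096):
    """Decode the block number and type codes from a GFS2 withdraw line.

    block_size is an assumption (hypothetical default); pass the real
    filesystem block size if it is known.
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("no 'bh = N (type: exp=X, found=Y)' in line")
    block, expected, found = (int(g) for g in match.groups())
    return {
        "block": block,
        "expected": GFS2_METATYPES.get(expected, "unknown"),
        "found": GFS2_METATYPES.get(found, "unknown"),
        "byte_offset": block * block_size,
        # How far the block number sits past the 32-bit boundary
        # (negative if it is below 2**32).
        "blocks_past_2_32": block - 2**32,
    }


if __name__ == "__main__":
    info = decode_withdraw(
        "GFS2: fsid=sde1.0: bh = 4294972166 (type: exp=3, found=2)"
    )
    for key, value in info.items():
        print(f"{key}: {value}")
```

Under these assumptions the kernel expected a resource group bitmap block
and found a resource group header instead. The block number is 4870 blocks
past 2^32, which with 4096-byte blocks would be just beyond the 16 TiB
mark of the volume. Whether that boundary is actually related to the
corruption is not something the log alone can establish.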
I again unmounted the disk but now when I try to fsck the filesystem I get:
urania# fsck.gfs2 -v /dev/sde1
Initializing fsck
Initializing lists...
Either the super block is corrupted, or this is not a GFS2 filesystem
The server is running kernel 2.6.32, 64-bit. The array is a Jetstore
516iS with a single 28TB iSCSI volume defined. The relevant line from
the fstab is:
/dev/sde1 /illumina gfs2 _netdev,rw,lockproto=lock_nolock
gfs2_tool isn't much help, nor is gfs2_edit:
urania# gfs2_tool sb /dev/sde1 all
/usr/src/cluster-3.0.7/gfs2/tool/../libgfs2/libgfs2.h: there isn't a GFS2 filesystem on /dev/sde1
urania# gfs2_edit -p sb /dev/sde1
bad seek: Invalid argument from gfs2_load_inode:416: block 3747350044811107074 (0x34014302ee029b02)
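
Independently of the GFS2 tools, the superblock header can be inspected
read-only with a short script. This is a minimal sketch, assuming the
on-disk layout described in the kernel's gfs2_ondisk.h: the superblock at
byte offset 65536 (GFS2_SB_ADDR of 128 basic blocks of 512 bytes), a
big-endian metadata header, and the magic number 0x01161970. The function
names are hypothetical and chosen only for this example.

```python
import struct

GFS2_MAGIC = 0x01161970       # GFS2_MAGIC from gfs2_ondisk.h
GFS2_METATYPE_SB = 1          # metadata type code for a superblock
GFS2_SB_OFFSET = 128 * 512    # GFS2_SB_ADDR (128) * basic block size (512)
HEADER_LEN = 20               # mh_magic, mh_type, __pad0, mh_format


def parse_meta_header(buf):
    """Return (magic, type, format) from a raw GFS2 metadata header.

    All fields are big-endian: two 32-bit words, 64 bits of padding,
    then the 32-bit format field.
    """
    if len(buf) < HEADER_LEN:
        raise ValueError("short read: need at least 20 bytes")
    magic, mh_type, _pad, mh_format = struct.unpack(">IIQI", buf[:HEADER_LEN])
    return magic, mh_type, mh_format


def read_sb_header(path):
    """Read the metadata header at the superblock offset (read-only)."""
    with open(path, "rb") as dev:
        dev.seek(GFS2_SB_OFFSET)
        return parse_meta_header(dev.read(HEADER_LEN))


def looks_like_gfs2_sb(path):
    """True if the header at the superblock offset has the GFS2 magic
    and the superblock type code."""
    magic, mh_type, _format = read_sb_header(path)
    return magic == GFS2_MAGIC and mh_type == GFS2_METATYPE_SB


if __name__ == "__main__":
    import sys
    magic, mh_type, mh_format = read_sb_header(sys.argv[1])
    print(f"magic=0x{magic:08x} type={mh_type} format={mh_format}")
```

Run as root with the device path as the only argument, for example
/dev/sde1. The script opens the device for reading only, so it does not
modify anything; it just shows whether the bytes at the superblock offset
still carry the expected magic and type.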
Is there an alternate superblock that I can use to mount the disk to at
least get the last couple of days of data off of it?
--
Dr. Douglas O'Neal
Manager, Bioinformatics Core Center
Center for Bioinformatics & Computational Biology
Delaware Biotechnology Institute
University of Delaware