On Sat, Oct 3, 2009 at 6:13 AM, Nicolas Ferré <nicolas.ferre@xxxxxxxxxxxxxxxx> wrote:
Hi,
We have a problem with our cluster: a gfs2 filesystem sometimes cannot be accessed after a system reboot, and I have to umount/mount it manually.
Here is the relevant part of /var/log/messages:
Oct 3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: fatal: invalid metadata block
Oct 3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: bh = 114419123 (magic number)
Oct 3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 334
Oct 3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: about to withdraw this file system
Oct 3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: telling LM to withdraw
Oct 3 11:46:16 slater kernel: VFS:Filesystem freeze failed
Oct 3 11:46:22 slater snmpd[7344]: Connection from UDP: [127.0.0.1]:58640
Oct 3 11:46:22 slater snmpd[7344]: Received SNMP packet(s) from UDP: [127.0.0.1]:58640
Oct 3 11:46:37 slater snmpd[7344]: Connection from UDP: [127.0.0.1]:47125
Oct 3 11:46:37 slater snmpd[7344]: Received SNMP packet(s) from UDP: [127.0.0.1]:47125
Oct 3 11:46:53 slater snmpd[7344]: Connection from UDP: [127.0.0.1]:33910
Oct 3 11:46:53 slater snmpd[7344]: Received SNMP packet(s) from UDP: [127.0.0.1]:33910
Oct 3 11:46:53 slater kernel: dlm: home: group leave failed -512 0
Oct 3 11:46:53 slater kernel: GFS2: fsid=crcmm:home.1: withdrawn
Oct 3 11:46:53 slater kernel:
Oct 3 11:46:53 slater kernel: Call Trace:
Oct 3 11:46:53 slater kernel: [<ffffffff8863a3ce>] :gfs2:gfs2_lm_withdraw+0xc1/0xd0
Oct 3 11:46:53 slater kernel: [<ffffffff80063a06>] __wait_on_bit+0x60/0x6e
Oct 3 11:46:53 slater kernel: [<ffffffff800153ac>] sync_buffer+0x0/0x3f
Oct 3 11:46:53 slater kernel: [<ffffffff80063a80>] out_of_line_wait_on_bit+0x6c/0x78
Oct 3 11:46:53 slater kernel: [<ffffffff8009f6ef>] wake_bit_function+0x0/0x23
Oct 3 11:46:53 slater kernel: [<ffffffff8001a7ac>] submit_bh+0x10a/0x111
Oct 3 11:46:53 slater kernel: [<ffffffff8864d547>] :gfs2:gfs2_meta_check_ii+0x2c/0x38
Oct 3 11:46:53 slater kernel: [<ffffffff8863de01>] :gfs2:gfs2_meta_indirect_buffer+0x104/0x15f
Oct 3 11:46:53 slater kernel: [<ffffffff8863d993>] :gfs2:gfs2_getbuf+0x106/0x115
Oct 3 11:46:53 slater kernel: [<ffffffff8862e786>] :gfs2:recursive_scan+0x96/0x175
Oct 3 11:46:53 slater kernel: [<ffffffff8862e82c>] :gfs2:recursive_scan+0x13c/0x175
Oct 3 11:46:53 slater kernel: [<ffffffff8862f6bc>] :gfs2:do_strip+0x0/0x349
Oct 3 11:46:53 slater kernel: [<ffffffff8862e8fe>] :gfs2:trunc_dealloc+0x99/0xe7
Oct 3 11:46:53 slater kernel: [<ffffffff8862f6bc>] :gfs2:do_strip+0x0/0x349
Oct 3 11:46:53 slater kernel: [<ffffffff88645dd2>] :gfs2:gfs2_delete_inode+0xdd/0x191
Oct 3 11:46:53 slater kernel: [<ffffffff88645d3b>] :gfs2:gfs2_delete_inode+0x46/0x191
Oct 3 11:46:53 slater kernel: [<ffffffff88635e77>] :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a
Oct 3 11:46:53 slater kernel: [<ffffffff88645cf5>] :gfs2:gfs2_delete_inode+0x0/0x191
Oct 3 11:46:53 slater kernel: [<ffffffff8002f49e>] generic_delete_inode+0xc6/0x143
Oct 3 11:46:53 slater kernel: [<ffffffff8864a99c>] :gfs2:gfs2_inplace_reserve_i+0x63b/0x691
Oct 3 11:46:53 slater kernel: [<ffffffff80021f3f>] __up_read+0x19/0x7f
Oct 3 11:46:53 slater kernel: [<ffffffff88635dd8>] :gfs2:do_promote+0xf5/0x137
Oct 3 11:46:53 slater kernel: [<ffffffff8863f24a>] :gfs2:gfs2_write_begin+0x16c/0x339
Oct 3 11:46:53 slater kernel: [<ffffffff88640a7b>] :gfs2:gfs2_file_buffered_write+0xf3/0x26c
Oct 3 11:46:53 slater kernel: [<ffffffff88640e4c>] :gfs2:__gfs2_file_aio_write_nolock+0x258/0x28f
Oct 3 11:46:53 slater kernel: [<ffffffff88640fee>] :gfs2:gfs2_file_write_nolock+0xaa/0x10f
Oct 3 11:46:54 slater kernel: [<ffffffff800c5145>] generic_file_read+0xac/0xc5
Oct 3 11:46:54 slater kernel: [<ffffffff8009f6c1>] autoremove_wake_function+0x0/0x2e
Oct 3 11:46:54 slater kernel: [<ffffffff88635e77>] :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a
Oct 3 11:46:54 slater kernel: [<ffffffff8009f6c1>] autoremove_wake_function+0x0/0x2e
Oct 3 11:46:54 slater kernel: [<ffffffff8864113e>] :gfs2:gfs2_file_write+0x49/0xa7
Oct 3 11:46:54 slater kernel: [<ffffffff80016927>] vfs_write+0xce/0x174
Oct 3 11:46:54 slater kernel: [<ffffffff800171df>] sys_write+0x45/0x6e
Oct 3 11:46:54 slater kernel: [<ffffffff8006149d>] sysenter_do_call+0x1e/0x6a
Oct 3 11:46:54 slater kernel:
Oct 3 11:46:54 slater kernel: GFS2: fsid=crcmm:home.1: gfs2_delete_inode: -5
Can someone explain the meaning of these messages, and how to cure the problem?
Regards,
--
Nicolas Ferré
Laboratoire Chimie Provence
Université de Provence - France
Tel: +33 491282733
http://sites.univ-provence.fr/lcp-ct
Nicolas,
Any time you see a gfs/gfs2 filesystem withdrawn message, do yourself a favor and run an fsck of the filesystem.
These two links might explain some of what you are seeing, especially after you run an fsck.
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Global_File_System/s1-manage-gfswithdraw.html
https://bugzilla.redhat.com/show_bug.cgi?id=210367
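If it helps to see at a glance which filesystem withdrew and where the kernel reported the failure before running the fsck, a rough sketch like the following can pull those details out of /var/log/messages. It assumes the GFS2 lines look exactly like the ones quoted above; the function name summarize_withdraws and the dictionary keys are made up for this illustration and are not part of any GFS2 tooling.

```python
import re

# Patterns modelled on the log lines quoted in this thread (an assumption:
# other kernel versions may word these messages differently).
GFS2_LINE = re.compile(r"kernel: GFS2: fsid=(?P<fsid>\S+): (?P<msg>.*)$")
LOCATION = re.compile(
    r"function = (?P<function>\w+), file = (?P<file>[^,\s]+), line = (?P<line>\d+)"
)
BLOCK = re.compile(r"bh = (?P<block>\d+)")


def summarize_withdraws(lines):
    """Return {fsid: details} for every GFS2 filesystem mentioned in `lines`.

    `summarize_withdraws` is a hypothetical helper written for this example.
    """
    summary = {}
    for raw in lines:
        match = GFS2_LINE.search(raw.rstrip("\n"))
        if match is None:
            continue  # not a GFS2 kernel message (snmpd noise, call trace, ...)
        msg = match.group("msg")
        info = summary.setdefault(match.group("fsid"), {"withdrawn": False})

        if msg.startswith("fatal:"):
            # e.g. "fatal: invalid metadata block"
            info["fatal"] = msg[len("fatal:"):].strip()
        elif msg == "withdrawn":
            info["withdrawn"] = True
        else:
            block = BLOCK.search(msg)
            if block:
                info["block"] = int(block.group("block"))
            location = LOCATION.search(msg)
            if location:
                info["function"] = location.group("function")
                info["file"] = location.group("file")
                info["line"] = int(location.group("line"))
    return summary


if __name__ == "__main__":
    # Reading the system log usually requires root privileges.
    with open("/var/log/messages", errors="replace") as log:
        for fsid, info in summarize_withdraws(log).items():
            print(fsid, info)
```

Any fsid reported with "withdrawn" set is a candidate for the fsck suggested above. The filesystem needs to be unmounted on every node of the cluster before the check is run.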
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster