I have a 5-node cluster running kernel 2.6.18-128.1.6.el5xen and
gfs2-utils-0.1.53-1.el5_3.3. Twice in 10 days, each node in my cluster
has failed with the same message in /var/log/messages. dmesg reports the
same errors, and on some nodes there are no entries preceding
the invalid metadata block error.
I would like to know what issues can trigger such an event. If it would
help for me to provide more information, I will be happy to; I'm
just not sure what other information you would consider relevant.
Thank you for your time,
-Kai Meyer
Sep 19 02:02:06 192.168.100.104 kernel: GFS2: fsid=xencluster1:xenclusterfs1.1: fatal: invalid metadata block
Sep 19 02:02:06 192.168.100.104 kernel: GFS2: fsid=xencluster1:xenclusterfs1.1: bh = 567447963 (magic number)
Sep 19 02:02:06 192.168.100.104 kernel: GFS2: fsid=xencluster1:xenclusterfs1.1: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 334
Sep 19 02:02:06 192.168.100.104 kernel: GFS2: fsid=xencluster1:xenclusterfs1.1: about to withdraw this file system
Sep 19 02:02:06 192.168.100.104 kernel: GFS2: fsid=xencluster1:xenclusterfs1.1: telling LM to withdraw
Sep 19 02:02:07 192.168.100.104 kernel: GFS2: fsid=xencluster1:xenclusterfs1.1: withdrawn
Sep 19 02:02:07 192.168.100.104 kernel:
Sep 19 02:02:07 192.168.100.104 kernel: Call Trace:
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff885154ce>] :gfs2:gfs2_lm_withdraw+0xc1/0xd0
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff80262907>] __wait_on_bit+0x60/0x6e
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff80215788>] sync_buffer+0x0/0x3f
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff80262981>] out_of_line_wait_on_bit+0x6c/0x78
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8029a01a>] wake_bit_function+0x0/0x23
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8021a7f1>] submit_bh+0x10a/0x111
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff885284a7>] :gfs2:gfs2_meta_check_ii+0x2c/0x38
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff88518d30>] :gfs2:gfs2_meta_indirect_buffer+0x104/0x160
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff88509fc3>] :gfs2:gfs2_block_map+0x1dc/0x33e
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8021a821>] poll_freewait+0x29/0x6a
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8850a199>] :gfs2:gfs2_extent_map+0x74/0xac
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8850a2ce>] :gfs2:gfs2_write_alloc_required+0xfd/0x122
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff885128d5>] :gfs2:gfs2_glock_nq+0x248/0x273
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8851a27c>] :gfs2:gfs2_write_begin+0x99/0x36a
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8851bd1b>] :gfs2:gfs2_file_buffered_write+0x14b/0x2e5
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8020d3a5>] file_read_actor+0x0/0xfc
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8851c151>] :gfs2:__gfs2_file_aio_write_nolock+0x29c/0x2d4
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8851c2f4>] :gfs2:gfs2_file_write_nolock+0xaa/0x10f
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8022eca0>] __wake_up+0x38/0x4f
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff80299fec>] autoremove_wake_function+0x0/0x2e
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8022fbe4>] pipe_readv+0x38e/0x3a2
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff80263bce>] lock_kernel+0x1b/0x32
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8851c444>] :gfs2:gfs2_file_write+0x49/0xa7
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff80216da9>] vfs_write+0xce/0x174
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff802175e1>] sys_write+0x45/0x6e
Sep 19 02:02:07 192.168.100.104 kernel: [<ffffffff8025f2f9>] tracesys+0xab/0xb6
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster