Hello,

We have been having trouble with our GFS2 filesystem for several days (log attached):

- we ran the FS for some time without trouble (since 2014-11-03)
- the FS was grown from 3 TB to 4 TB about 6 months ago
- it seems to happen on only one node, "nebula3"
- I ran fsck when just fencing the node was not sufficient (2 crashes the same day)

The nodes run an up-to-date Ubuntu Trusty Tahr.

Do you have any idea?

Regards.
-- 
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF

Feb 10 09:08:08 nebula3 kernel: [53799.437568] GFS2: buf_blk = 0x3248 old_state=0, new_state=0
Feb 10 09:08:08 nebula3 kernel: [53799.437577] GFS2: rgrp=0x3ff67bbd bi_start=0x0
Feb 10 09:08:08 nebula3 kernel: [53799.437579] GFS2: bi_offset=0x80 bi_len=0xf80
Feb 10 09:08:08 nebula3 kernel: [53799.437585] CPU: 9 PID: 48112 Comm: rm Not tainted 3.13.0-77-generic #121-Ubuntu
Feb 10 09:08:08 nebula3 kernel: [53799.437588] Hardware name: Dell Inc. PowerEdge M620/0T36VK, BIOS 2.2.7 01/21/2014
Feb 10 09:08:08 nebula3 kernel: [53799.437591] 000000003ff6ae0b ffff8816d6421af0 ffffffff81725138 000000003ff6ae0b
Feb 10 09:08:08 nebula3 kernel: [53799.437599] ffff8816d6421b48 ffffffffa05b0bbf ffff8817cb61f100 00000000a05b7977
Feb 10 09:08:08 nebula3 kernel: [53799.437605] ffff8817cb638198 0000000000003248 ffff8816d671f000 0000000000000020
Feb 10 09:08:08 nebula3 kernel: [53799.437611] Call Trace:
Feb 10 09:08:08 nebula3 kernel: [53799.437629] [<ffffffff81725138>] dump_stack+0x45/0x56
Feb 10 09:08:08 nebula3 kernel: [53799.437650] [<ffffffffa05b0bbf>] rgblk_free+0x1ff/0x230 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437663] [<ffffffffa05b2f34>] __gfs2_free_blocks+0x34/0x120 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437671] [<ffffffffa058f076>] recursive_scan+0x5b6/0x6a0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437679] [<ffffffffa058ef2c>] recursive_scan+0x46c/0x6a0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437691] [<ffffffffa05ad4f5>] ? gfs2_quota_hold+0x175/0x1f0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437699] [<ffffffffa058f25a>] trunc_dealloc+0xfa/0x120 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437708] [<ffffffffa059a98e>] ? gfs2_glock_wait+0x3e/0x80 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437718] [<ffffffffa059c190>] ? gfs2_glock_nq+0x280/0x430 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437726] [<ffffffffa0590ef0>] gfs2_file_dealloc+0x10/0x20 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437737] [<ffffffffa05b3db3>] gfs2_evict_inode+0x2b3/0x3e0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437746] [<ffffffffa05b3c13>] ? gfs2_evict_inode+0x113/0x3e0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437755] [<ffffffff811d99b0>] evict+0xb0/0x1b0
Feb 10 09:08:08 nebula3 kernel: [53799.437760] [<ffffffff811da1c5>] iput+0xf5/0x180
Feb 10 09:08:08 nebula3 kernel: [53799.437767] [<ffffffff811ceb1e>] do_unlinkat+0x18e/0x2b0
Feb 10 09:08:08 nebula3 kernel: [53799.437775] [<ffffffff811bbb76>] ? filp_close+0x56/0x70
Feb 10 09:08:08 nebula3 kernel: [53799.437780] [<ffffffff811cfa4b>] SyS_unlinkat+0x1b/0x40
Feb 10 09:08:08 nebula3 kernel: [53799.437788] [<ffffffff81735d1d>] system_call_fastpath+0x1a/0x1f
Feb 10 09:08:08 nebula3 kernel: [53799.437794] GFS2: fsid=yggdrasil:datastores.2: fatal: filesystem consistency error
Feb 10 09:08:08 nebula3 kernel: [53799.437794] GFS2: fsid=yggdrasil:datastores.2: RG = 1073118141
Feb 10 09:08:08 nebula3 kernel: [53799.437794] GFS2: fsid=yggdrasil:datastores.2: function = gfs2_setbit, file = /build/linux-faWYrf/linux-3.13.0/fs/gfs2/rgrp.c, line = 103
Feb 10 09:08:08 nebula3 kernel: [53799.437797] GFS2: fsid=yggdrasil:datastores.2: about to withdraw this file system
Feb 10 09:08:08 nebula3 kernel: [53799.441715] GFS2: fsid=yggdrasil:datastores.2: gfs2_evict_inode: -5
Feb 10 09:08:10 nebula3 kernel: [53801.764726] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 09:08:11 nebula3 kernel: [53802.249691] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 09:08:11 nebula3 kernel: [53802.254133] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 09:08:12 nebula3 kernel: [53803.330583] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5

[...] Node restarted

Feb 10 11:17:05 nebula3 kernel: [ 6703.936206] GFS2: fsid=yggdrasil:datastores.2: fatal: filesystem consistency error
Feb 10 11:17:05 nebula3 kernel: [ 6703.936206] GFS2: fsid=yggdrasil:datastores.2: inode = 11514 30312500
Feb 10 11:17:05 nebula3 kernel: [ 6703.936206] GFS2: fsid=yggdrasil:datastores.2: function = gfs2_dinode_dealloc, file = /build/linux-OTIHGI/linux-3.13.0/fs/gfs2/super.c, line = 1371
Feb 10 11:17:05 nebula3 kernel: [ 6703.936216] GFS2: fsid=yggdrasil:datastores.2: about to withdraw this file system
Feb 10 11:17:05 nebula3 kernel: [ 6703.975181] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 11:17:05 nebula3 kernel: [ 6704.073107] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 11:17:05 nebula3 kernel: [ 6704.076098] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 11:17:05 nebula3 kernel: [ 6704.078946] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5

[...] All node down + fsck.gfs2 on the FS
[...] 5 days later

Feb 15 09:22:27 nebula3 kernel: [411282.308290] GFS2: buf_blk = 0x2089 old_state=0, new_state=0
Feb 15 09:22:27 nebula3 kernel: [411282.308295] GFS2: rgrp=0xc0c5667 bi_start=0x0
Feb 15 09:22:27 nebula3 kernel: [411282.308296] GFS2: bi_offset=0x80 bi_len=0xf80
Feb 15 09:22:27 nebula3 kernel: [411282.308300] CPU: 9 PID: 11494 Comm: rm Tainted: G W 3.13.0-78-generic #122-Ubuntu
Feb 15 09:22:27 nebula3 kernel: [411282.308301] Hardware name: Dell Inc. PowerEdge M620/0T36VK, BIOS 2.2.7 01/21/2014
Feb 15 09:22:27 nebula3 kernel: [411282.308303] 000000000c0c7705 ffff882e055f9a48 ffffffff81725768 000000000c0c76f6
Feb 15 09:22:27 nebula3 kernel: [411282.308309] ffff882e055f9aa0 ffffffffa05bcbbf ffff8817de436e00 00000000a05c3977
Feb 15 09:22:27 nebula3 kernel: [411282.308312] ffff8817de414d48 0000000000002089 ffff882d6f2ff000 0000000000000010
Feb 15 09:22:27 nebula3 kernel: [411282.308315] Call Trace:
Feb 15 09:22:27 nebula3 kernel: [411282.308327] [<ffffffff81725768>] dump_stack+0x45/0x56
Feb 15 09:22:27 nebula3 kernel: [411282.308340] [<ffffffffa05bcbbf>] rgblk_free+0x1ff/0x230 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308348] [<ffffffffa05bef34>] __gfs2_free_blocks+0x34/0x120 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308352] [<ffffffffa059b076>] recursive_scan+0x5b6/0x6a0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308356] [<ffffffffa059af2c>] recursive_scan+0x46c/0x6a0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308360] [<ffffffffa059af2c>] recursive_scan+0x46c/0x6a0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308367] [<ffffffffa05b94f5>] ? gfs2_quota_hold+0x175/0x1f0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308371] [<ffffffffa059b25a>] trunc_dealloc+0xfa/0x120 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308377] [<ffffffffa05a698e>] ? gfs2_glock_wait+0x3e/0x80 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308382] [<ffffffffa05a8190>] ? gfs2_glock_nq+0x280/0x430 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308387] [<ffffffffa059cef0>] gfs2_file_dealloc+0x10/0x20 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308393] [<ffffffffa05bfdb3>] gfs2_evict_inode+0x2b3/0x3e0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308398] [<ffffffffa05bfc13>] ? gfs2_evict_inode+0x113/0x3e0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308403] [<ffffffff811d9a40>] evict+0xb0/0x1b0
Feb 15 09:22:27 nebula3 kernel: [411282.308406] [<ffffffff811da255>] iput+0xf5/0x180
Feb 15 09:22:27 nebula3 kernel: [411282.308410] [<ffffffff811cebae>] do_unlinkat+0x18e/0x2b0
Feb 15 09:22:27 nebula3 kernel: [411282.308415] [<ffffffff811bbc06>] ? filp_close+0x56/0x70
Feb 15 09:22:27 nebula3 kernel: [411282.308418] [<ffffffff811cfadb>] SyS_unlinkat+0x1b/0x40
Feb 15 09:22:27 nebula3 kernel: [411282.308421] [<ffffffff8173635d>] system_call_fastpath+0x1a/0x1f
Feb 15 09:22:27 nebula3 kernel: [411282.308424] GFS2: fsid=yggdrasil:datastores.1: fatal: filesystem consistency error
Feb 15 09:22:27 nebula3 kernel: [411282.308424] GFS2: fsid=yggdrasil:datastores.1: RG = 202135143
Feb 15 09:22:27 nebula3 kernel: [411282.308424] GFS2: fsid=yggdrasil:datastores.1: function = gfs2_setbit, file = /build/linux-OTIHGI/linux-3.13.0/fs/gfs2/rgrp.c, line = 103
Feb 15 09:22:27 nebula3 kernel: [411282.308426] GFS2: fsid=yggdrasil:datastores.1: about to withdraw this file system
Feb 15 09:22:27 nebula3 kernel: [411282.483258] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:27 nebula3 kernel: [411282.627372] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:27 nebula3 kernel: [411282.876874] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:27 nebula3 kernel: [411282.879708] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:28 nebula3 kernel: [411283.383218] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:28 nebula3 kernel: [411283.397423] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:28 nebula3 kernel: [411283.399253] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
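
A note for reading the log: in both gfs2_setbit failures, the hexadecimal "rgrp=" value in the debug lines and the decimal "RG =" value in the consistency error appear to be the same number, so both presumably name the same resource group. A minimal sketch to check this, assuming only the values copied from the log (the helper name `same_rgrp` is hypothetical, not part of any GFS2 tool):

```python
# Values copied from the log: (hex "rgrp=" field, decimal "RG =" field).
PAIRS = [
    ("0x3ff67bbd", 1073118141),  # Feb 10 09:08:08, fsid datastores.2
    ("0xc0c5667", 202135143),    # Feb 15 09:22:27, fsid datastores.1
]


def same_rgrp(hex_addr: str, dec_addr: int) -> bool:
    """Return True if the hex and decimal fields encode the same number."""
    return int(hex_addr, 16) == dec_addr


if __name__ == "__main__":
    for hex_addr, dec_addr in PAIRS:
        print(hex_addr, dec_addr, same_rgrp(hex_addr, dec_addr))
```

This only compares the numbers as printed; it says nothing about why the bitmap state was inconsistent.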
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster