----- Original Message -----
| 8-node cluster, Fibre Channel HBAs and disk access through a QLogic
| fabric.
|
| I've been hit 3 times with this error on different nodes:
|
| GFS2: fsid=CyberCluster:GizServer.1: fatal: filesystem consistency error
| GFS2: fsid=CyberCluster:GizServer.1: inode = 9582 6698267
| GFS2: fsid=CyberCluster:GizServer.1: function = gfs2_dinode_dealloc, file = fs/gfs2/inode.c, line = 352
| GFS2: fsid=CyberCluster:GizServer.1: about to withdraw this file system
| GFS2: fsid=CyberCluster:GizServer.1: telling LM to unmount
| GFS2: fsid=CyberCluster:GizServer.1: withdrawn
| Pid: 2659, comm: delete_workqueu Tainted: G W ---------------- T 2.6.32-131.2.1.el6.x86_64 #1
| Call Trace:
| [<ffffffffa044ffd2>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
| [<ffffffffa0425209>] ? trunc_dealloc+0xa9/0x130 [gfs2]
| [<ffffffffa04501dd>] ? gfs2_consist_inode_i+0x5d/0x60 [gfs2]
| [<ffffffffa0435584>] ? gfs2_dinode_dealloc+0x64/0x210 [gfs2]
| [<ffffffffa044e1da>] ? gfs2_delete_inode+0x1ba/0x280 [gfs2]
| [<ffffffffa044e0ad>] ? gfs2_delete_inode+0x8d/0x280 [gfs2]
| [<ffffffffa044e020>] ? gfs2_delete_inode+0x0/0x280 [gfs2]
| [<ffffffff8118cfbe>] ? generic_delete_inode+0xde/0x1d0
| [<ffffffffa0432940>] ? delete_work_func+0x0/0x80 [gfs2]
| [<ffffffff8118d115>] ? generic_drop_inode+0x65/0x80
| [<ffffffffa044cc4e>] ? gfs2_drop_inode+0x2e/0x30 [gfs2]
| [<ffffffff8118bf82>] ? iput+0x62/0x70
| [<ffffffffa0432994>] ? delete_work_func+0x54/0x80 [gfs2]
| [<ffffffff810887d0>] ? worker_thread+0x170/0x2a0
| [<ffffffff8108e100>] ? autoremove_wake_function+0x0/0x40
| [<ffffffff81088660>] ? worker_thread+0x0/0x2a0
| [<ffffffff8108dd96>] ? kthread+0x96/0xa0
| [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
| [<ffffffff8108dd00>] ? kthread+0x0/0xa0
| [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
| no_formal_ino = 9582
| no_addr = 6698267
| i_disksize = 6838
| blocks = 0
| i_goal = 6698304
| i_diskflags = 0x00000000
| i_height = 1
| i_depth = 0
| i_entries = 0
| i_eattr = 0
| GFS2: fsid=CyberCluster:GizServer.1: gfs2_delete_inode: -5
| gdlm_unlock 5,66351b err=-22
|
|
| Only the inode differs each time.
|
| After that event, services running on that filesystem are marked
| failed and are not moved over to another node. Any access to that fs
| yields an I/O error. The server needed to be rebooted to work properly
| again.
|
| I ran an fsck last night on that filesystem, and it did find some
| errors, but nothing serious. Lots (really lots) of these:
|
| Ondisk and fsck bitmaps differ at block 5771602 (0x581152)
| Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free)
| Metadata type is 0 (free)
| Fix bitmap for block 5771602 (0x581152) ? (y/n)
|
| After completing the fsck, I started some services back up, and I got
| the same error on another filesystem that is practically empty and
| used for small utilities used throughout the cluster...
|
| What should I do to find the source of this problem?

Hi,

I believe this is a GFS2 bug we've already solved.
Please contact Red Hat Support.

Regards,

Bob Peterson
Red Hat File Systems

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster