Advice needed with file system corruption

Hi All,

We have a RAID system with file system issues. The setup is as follows:

50 TB in RAID 6 hosted on an Adaptec 71605 controller using WD4000FYYZ drives.

CentOS 6.7, kernel 2.6.32-642.el6.x86_64, xfsprogs-3.1.1-16.el6

While the array was rebuilding onto a replaced disk, with the file system online and in use, the system logs showed multiple entries of:

XFS (sde): Corruption detected. Unmount and run xfs_repair.

[See the end of this post for a section of the XFS-related errors in the log.]

I unmounted the filesystem and waited for the controller to finish rebuilding the array. I then moved the most important data to another RAID array on a different server. The data is generated from HPC simulations and is not backed up, but it can be regenerated if needed.

The default el6 "xfs_repair" is in "xfsprogs-3.1.1-16.el6". I notice that the "elrepo_testing" repository has a much later version of "xfsprogs", namely:

 xfsprogs.x86_64 4.3.0-1.el6.elrepo

As far as I understand, the userspace tools are backwards compatible, so would it be better to use the "4.3" release of "xfsprogs" instead of the default "3.1.1" included in the installation of el6?

I ran "xfs_repair -nv /dev/sde" with both "3.1.1" and "4.3". Both runs completed successfully and listed the repairs that would have been made. I can post the output if requested.

The "3.1.1" version of "xfs_repair -n" ran in 1 minute, 32 seconds.

The "4.3" version of "xfs_repair -n" ran in 50 seconds.
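
In case it helps anyone comparing the two dry runs, here is a minimal Python sketch that diffs two saved "xfs_repair -n" reports. It assumes the output of each run was captured to a text file beforehand; the function name and the labels in the diff header are made up for illustration and are not part of xfsprogs.

```python
import difflib
from typing import List


def diff_repair_reports(old_report: str, new_report: str) -> List[str]:
    """Return unified-diff lines between two saved dry-run reports.

    Both arguments are the full text of a captured `xfs_repair -n` run.
    An empty list means the two reports are identical line for line.
    """
    return list(
        difflib.unified_diff(
            old_report.splitlines(),
            new_report.splitlines(),
            fromfile="xfs_repair-3.1.1",  # label only, hypothetical
            tofile="xfs_repair-4.3.0",    # label only, hypothetical
            lineterm="",
        )
    )
```

To use it, read each saved report with pathlib.Path(...).read_text() and print the returned lines. Lines starting with "-" appear only in the first report and lines starting with "+" only in the second. Differences in wording between versions will also show up, so the diff is only a starting point for reading the two reports side by side.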


So my questions are:

[1] Which version of "xfs_repair" should I use to make the repair?

[2] Is there anything I should have done differently?


Many thanks for any advice; it is much appreciated.

Thanks,  Steve



About 20 blocks of output similar to the following were repeated in the logs.

Jul  8 18:40:17 sraid1v kernel: ffff880dca95b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Jul  8 18:40:17 sraid1v kernel: XFS (sde): Internal error xfs_da_do_buf(2) at line 2136 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffffa0e6e81a
Jul  8 18:40:17 sraid1v kernel:
Jul  8 18:40:17 sraid1v kernel: Pid: 8844, comm: idl Tainted: P -- ------------ 2.6.32-642.el6.x86_64 #1
Jul  8 18:40:17 sraid1v kernel: Call Trace:
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e7b68f>] ? xfs_error_report+0x3f/0x50 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e6e81a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e7b6fe>] ? xfs_corruption_error+0x5e/0x90 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e6e6fc>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e6e81a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffff810154e3>] ? native_sched_clock+0x13/0x80
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e6e81a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e74a21>] ? xfs_dir2_leaf_lookup_int+0x61/0x2c0 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e74a21>] ? xfs_dir2_leaf_lookup_int+0x61/0x2c0 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e74e05>] ? xfs_dir2_leaf_lookup+0x35/0xf0 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e71306>] ? xfs_dir2_isleaf+0x26/0x60 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e71ce4>] ? xfs_dir_lookup+0x174/0x190 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e9ea47>] ? xfs_lookup+0x87/0x110 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0eabd74>] ? xfs_vn_lookup+0x54/0xa0 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffff811a9ca5>] ? do_lookup+0x1a5/0x230
Jul  8 18:40:17 sraid1v kernel: [<ffffffff811aa823>] ? __link_path_walk+0x763/0x1060
Jul  8 18:40:17 sraid1v kernel: [<ffffffff811ab3da>] ? path_walk+0x6a/0xe0
Jul  8 18:40:17 sraid1v kernel: [<ffffffff811ab5eb>] ? filename_lookup+0x6b/0xc0
Jul  8 18:40:17 sraid1v kernel: [<ffffffff8123ac46>] ? security_file_alloc+0x16/0x20
Jul  8 18:40:17 sraid1v kernel: [<ffffffff811acac4>] ? do_filp_open+0x104/0xd20
Jul  8 18:40:17 sraid1v kernel: [<ffffffffa0e9a4fc>] ? _xfs_trans_commit+0x25c/0x310 [xfs]
Jul  8 18:40:17 sraid1v kernel: [<ffffffff812a749a>] ? strncpy_from_user+0x4a/0x90
Jul  8 18:40:17 sraid1v kernel: [<ffffffff811ba252>] ? alloc_fd+0x92/0x160
Jul  8 18:40:17 sraid1v kernel: [<ffffffff81196bd7>] ? do_sys_open+0x67/0x130
Jul  8 18:40:17 sraid1v kernel: [<ffffffff81196ce0>] ? sys_open+0x20/0x30
Jul  8 18:40:17 sraid1v kernel: [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
Jul  8 18:40:17 sraid1v kernel: XFS (sde): Corruption detected. Unmount and run xfs_repair
Jul  8 18:40:17 sraid1v kernel: ffff880dca95b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Jul  8 18:40:17 sraid1v kernel: XFS (sde): Internal error xfs_da_do_buf(2) at line 2136 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffffa0e6e81a
Jul  8 18:40:17 sraid1v kernel:
Jul  8 18:40:17 sraid1v kernel: Pid: 8844, comm: idl Tainted: P -- ------------ 2.6.32-642.el6.x86_64 #1
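
Since the same block repeats many times, a rough tally of the XFS messages per device may be easier to read than the raw log. The following is a minimal Python sketch, assuming the syslog text has been read into a string; the function name and the regular expression are my own and only recognise the two message types shown above.

```python
import re
from collections import Counter

# Matches the two kinds of XFS kernel messages seen in the excerpt above.
# The device name is whatever appears in parentheses after "XFS".
XFS_MSG = re.compile(
    r"XFS \((?P<dev>[^)]+)\): "
    r"(?P<msg>Corruption detected|Internal error \S+)"
)

# Three message lines copied from the log excerpt in this post.
SAMPLE = """\
Jul  8 18:40:17 sraid1v kernel: XFS (sde): Internal error xfs_da_do_buf(2) at line 2136 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffffa0e6e81a
Jul  8 18:40:17 sraid1v kernel: XFS (sde): Corruption detected. Unmount and run xfs_repair
Jul  8 18:40:17 sraid1v kernel: XFS (sde): Internal error xfs_da_do_buf(2) at line 2136 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffffa0e6e81a
"""


def summarize_xfs_errors(log_text: str) -> Counter:
    """Count XFS error messages in log_text, keyed by (device, message)."""
    counts = Counter()
    # finditer scans the whole text, so it also copes with several
    # log entries that were joined onto a single line.
    for match in XFS_MSG.finditer(log_text):
        counts[(match.group("dev"), match.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    for (dev, msg), n in sorted(summarize_xfs_errors(SAMPLE).items()):
        print(f"{dev}: {msg}: {n}")
```

Run against the full log, this would show whether every report refers to the same device and the same internal error, or whether other code paths were hit as well.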







_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs


