Hi All,
We have a RAID system with filesystem issues, as follows:
50 TB in RAID 6 hosted on an Adaptec 71605 controller using WD4000FYYZ drives.
CentOS 6.7, kernel 2.6.32-642.el6.x86_64, xfsprogs-3.1.1-16.el6
While rebuilding a replaced disk, with the filesystem online and in use, the system logs showed multiple entries of:
XFS (sde): Corruption detected. Unmount and run xfs_repair.
[See the end of this post for a section of the XFS-related errors in the log.]
I unmounted the filesystem and waited for the controller to finish rebuilding the array. I then moved the most important data to another RAID array on a different server. The data is generated from HPC simulations and is not backed up, but it can be regenerated if needed.
The default el6 "xfs_repair" is in "xfsprogs-3.1.1-16.el6". I notice that the "elrepo_testing" repository has a much later version of "xfsprogs", namely:
xfsprogs.x86_64 4.3.0-1.el6.elrepo
As far as I understand, the userspace tools are backwards compatible, so would it be better to use the "4.3" release of "xfsprogs" instead of the default "3.1.1" included in the el6 installation?
I ran an "xfs_repair -nv /dev/sde" with both "3.1.1" and "4.3", and both completed successfully, showing the repairs that would have been made. I can post the output if requested.
The "3.1.1" version of "xfs_repair -n" ran in 1 minute, 32 seconds.
The "4.3" version of "xfs_repair -n" ran in 50 seconds.
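For context, this is a sketch of the dry-run workflow I followed (the device name /dev/sde is from my system; the commands need root and should be adjusted for your setup):

```shell
# Filesystem must be offline before any repair work.
umount /dev/sde

# Dry run: -n reports what xfs_repair *would* change without
# writing anything to the device; -v adds verbose output.
xfs_repair -nv /dev/sde

# Only after reviewing the dry-run output, run the real repair
# (same invocation without -n):
#   xfs_repair -v /dev/sde
```

A newer xfs_repair binary can be run directly from wherever it is installed, so the dry run can be repeated with each version for comparison before committing to the actual repair.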
So my questions are:
[1] Which version of "xfs_repair" should I use to make the repair?
[2] Is there anything I should have done differently?
Many thanks for any advice; it is much appreciated.
Thanks, Steve
Many blocks (about 20) similar to the following were repeated in the logs:
Jul 8 18:40:17 sraid1v kernel: ffff880dca95b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 8 18:40:17 sraid1v kernel: XFS (sde): Internal error xfs_da_do_buf(2) at line 2136 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffffa0e6e81a
Jul 8 18:40:17 sraid1v kernel:
Jul 8 18:40:17 sraid1v kernel: Pid: 8844, comm: idl Tainted: P -- ------------ 2.6.32-642.el6.x86_64 #1
Jul 8 18:40:17 sraid1v kernel: Call Trace:
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e7b68f>] ? xfs_error_report+0x3f/0x50 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e6e81a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e7b6fe>] ? xfs_corruption_error+0x5e/0x90 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e6e6fc>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e6e81a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffff810154e3>] ? native_sched_clock+0x13/0x80
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e6e81a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e74a21>] ? xfs_dir2_leaf_lookup_int+0x61/0x2c0 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e74a21>] ? xfs_dir2_leaf_lookup_int+0x61/0x2c0 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e74e05>] ? xfs_dir2_leaf_lookup+0x35/0xf0 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e71306>] ? xfs_dir2_isleaf+0x26/0x60 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e71ce4>] ? xfs_dir_lookup+0x174/0x190 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e9ea47>] ? xfs_lookup+0x87/0x110 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0eabd74>] ? xfs_vn_lookup+0x54/0xa0 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffff811a9ca5>] ? do_lookup+0x1a5/0x230
Jul 8 18:40:17 sraid1v kernel: [<ffffffff811aa823>] ? __link_path_walk+0x763/0x1060
Jul 8 18:40:17 sraid1v kernel: [<ffffffff811ab3da>] ? path_walk+0x6a/0xe0
Jul 8 18:40:17 sraid1v kernel: [<ffffffff811ab5eb>] ? filename_lookup+0x6b/0xc0
Jul 8 18:40:17 sraid1v kernel: [<ffffffff8123ac46>] ? security_file_alloc+0x16/0x20
Jul 8 18:40:17 sraid1v kernel: [<ffffffff811acac4>] ? do_filp_open+0x104/0xd20
Jul 8 18:40:17 sraid1v kernel: [<ffffffffa0e9a4fc>] ? _xfs_trans_commit+0x25c/0x310 [xfs]
Jul 8 18:40:17 sraid1v kernel: [<ffffffff812a749a>] ? strncpy_from_user+0x4a/0x90
Jul 8 18:40:17 sraid1v kernel: [<ffffffff811ba252>] ? alloc_fd+0x92/0x160
Jul 8 18:40:17 sraid1v kernel: [<ffffffff81196bd7>] ? do_sys_open+0x67/0x130
Jul 8 18:40:17 sraid1v kernel: [<ffffffff81196ce0>] ? sys_open+0x20/0x30
Jul 8 18:40:17 sraid1v kernel: [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
Jul 8 18:40:17 sraid1v kernel: XFS (sde): Corruption detected. Unmount and run xfs_repair
_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs