Re: XFS File system in trouble


 



It's failing, again. The rsync job failed and when I attempt to untar the file in the image mount, it fails there, as well. See below. I formatted a 1.5T drive as xfs and mounted it under /media. I then dumped the failing FS to a file on /media using xfs_metadump and used xfs_mdrestore to create an image of the FS. I then mounted the image, copied over the tarball to its location, and ran tar to extract the files:

RAID-Server:/# mount -o nouuid /media/md0.img /TEST

RAID-Server:/# cd "/TEST/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver"/

RAID-Server:/TEST/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver# cp "/RAID/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver/RR_27xx.tar.gz" ./

RAID-Server:/TEST/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver# tar -xzvf RR_27xx.tar.gz
DC7280/
DC7280/Linux/
DC7280/Linux/Opensource/
DC7280/Linux/Opensource/DC7280-linux-src-v1.0-110621-1313.tar.gz
DC7280/Windows/
DC7280/Windows/Vista-Win2008-Win7/
DC7280/Windows/Vista-Win2008-Win7/x32/
DC7280/Windows/Vista-Win2008-Win7/x32/dc7280.cat
DC7280/Windows/Vista-Win2008-Win7/x32/dc7280.inf
DC7280/Windows/Vista-Win2008-Win7/x32/dc7280.sys
DC7280/Windows/Vista-Win2008-Win7/x64/
DC7280/Windows/Vista-Win2008-Win7/x64/dc7280.cat
DC7280/Windows/Vista-Win2008-Win7/x64/dc7280.inf
DC7280/Windows/Vista-Win2008-Win7/x64/dc7280.sys
DC7280/Windows/Vista-Win2008-Win7/Readme.txt
DC7280/.ddinfo
R272x/
R272x/Linux/
R272x/Linux/Opensource/
R272x/Linux/Opensource/partial/
R272x/Linux/Opensource/partial/include/

...

RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/pcitable
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/readme.txt
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/rhdd
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/rhel-install-step1.sh
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/rhel-install-step2.sh
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64: Cannot mkdir: Structure needs cleaning
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/install.sh
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64: Cannot mkdir: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/install.sh: Cannot open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/installmethod.py
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64: Cannot mkdir: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/installmethod.py: Cannot open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modinfo
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64: Cannot mkdir: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modinfo: Cannot open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modules.alias
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64: Cannot mkdir: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modules.alias: Cannot open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modules.cgz

tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64: Cannot mkdir: Input/output error
gzip: stdin: Input/output error

tar: Unexpected EOF in archive
tar: RR274x/Driver/Linux/RHEL_CentOS: Cannot utime: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS: Cannot change ownership to uid 0, gid 1000: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS: Cannot change mode to rwxr-xr-x: Input/output error
tar: RR274x/Driver/Linux: Cannot utime: Input/output error
tar: RR274x/Driver/Linux: Cannot change ownership to uid 0, gid 1000: Input/output error
tar: RR274x/Driver/Linux: Cannot change mode to rwxr-xr-x: Input/output error
tar: RR274x/Driver: Cannot utime: Input/output error
tar: RR274x/Driver: Cannot change ownership to uid 0, gid 1000: Input/output error
tar: RR274x/Driver: Cannot change mode to rwxr-xr-x: Input/output error
tar: RR274x: Cannot utime: Input/output error
tar: RR274x: Cannot change ownership to uid 0, gid 1000: Input/output error
tar: RR274x: Cannot change mode to rwxr-xr-x: Input/output error
tar: Error is not recoverable: exiting now
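
For reference, the messages tar prints above are the C library's strings for two errno values. A minimal Python sketch, assuming a Linux host (the helper name `describe_errno` is made up for illustration), shows the mapping: "Input/output error" is EIO, the same "error 5" that appears in the xfs_log_force lines of the dmesg output, and "Structure needs cleaning" is EUCLEAN, the code XFS uses to report on-disk corruption.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 5 is the value shown in "xfs_log_force: error 5 returned".
    print(describe_errno(5))
    # EUCLEAN is Linux-specific, so look it up defensively.
    euclean = getattr(errno, "EUCLEAN", None)
    if euclean is not None:
        print(describe_errno(euclean))
```

The exact message text comes from the local C library, so it may differ on non-glibc systems.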


dmesg:
[131329.013475] XFS (md0): Mounting V4 Filesystem
[131329.918438] XFS (md0): Ending clean mount
[131499.357099] XFS (md0): Mounting V4 Filesystem
[131499.709248] XFS (md0): Ending clean mount
[131874.545344] loop: module loaded
[131874.549914] XFS (loop0): Mounting V4 Filesystem
[131874.555540] XFS (loop0): Ending clean mount
[132020.964431] XFS (loop0): xfs_iread: validation failed for inode 124656869424 failed
[132020.964435] ffff88028b078000: 49 4e 00 00 03 02 00 00 00 30 00 70 00 00 03 e8 IN.......0.p....
[132020.964437] ffff88028b078010: 00 00 00 00 06 20 b0 6f 01 2e 00 00 00 00 00 16 ..... .o........
[132020.964438] ffff88028b078020: 01 57 37 fd 2b 5d 22 9e 1e 0a 61 8c 00 00 00 20 .W7.+]"...a....
[132020.964440] ffff88028b078030: ff ff 00 d2 1b f6 27 90 00 00 00 00 00 00 00 00 ......'.........
[132020.964454] XFS (loop0): Internal error xfs_iread at line 392 of file /build/linux-QZaPpC/linux-3.16.7-ckt11/fs/xfs/xfs_inode_buf.c. Caller xfs_iget+0x24b/0x690 [xfs]
[132020.964457] CPU: 2 PID: 21474 Comm: tar Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1
[132020.964459] Hardware name: To be filled by O.E.M. To be filled by O.E.M./SABERTOOTH 990FX R2.0, BIOS 1503 01/11/2013
[132020.964460] 0000000000000001 ffffffff8150b405 ffff880424059800 ffffffffa09115cb
[132020.964463] 0000018800000010 ffffffffa0916f6b ffff88030f5c6c00 ffff880424059800
[132020.964465] 0000000000000075 ffff8800ad1afe98 ffffffffa095cb3a ffffffffa0916f6b
[132020.964467] Call Trace:
[132020.964471]  [<ffffffff8150b405>] ? dump_stack+0x41/0x51
[132020.964478]  [<ffffffffa09115cb>] ? xfs_corruption_error+0x5b/0x80 [xfs]
[132020.964483]  [<ffffffffa0916f6b>] ? xfs_iget+0x24b/0x690 [xfs]
[132020.964492]  [<ffffffffa095cb3a>] ? xfs_iread+0xea/0x400 [xfs]
[132020.964497]  [<ffffffffa0916f6b>] ? xfs_iget+0x24b/0x690 [xfs]
[132020.964503]  [<ffffffffa0916f6b>] ? xfs_iget+0x24b/0x690 [xfs]
[132020.964511]  [<ffffffffa0956de6>] ? xfs_ialloc+0xa6/0x500 [xfs]
[132020.964517]  [<ffffffffa092658e>] ? kmem_zone_alloc+0x6e/0xe0 [xfs]
[132020.964525]  [<ffffffffa09572a2>] ? xfs_dir_ialloc+0x62/0x2a0 [xfs]
[132020.964531]  [<ffffffffa09251e5>] ? xfs_trans_reserve+0x1f5/0x200 [xfs]
[132020.964538]  [<ffffffffa09579a9>] ? xfs_create+0x489/0x700 [xfs]
[132020.964541]  [<ffffffff811b40ea>] ? kern_path_create+0xaa/0x190
[132020.964548]  [<ffffffffa091c5ea>] ? xfs_generic_create+0xca/0x250 [xfs]
[132020.964550]  [<ffffffff811b7ad0>] ? vfs_mkdir+0xb0/0x160
[132020.964551]  [<ffffffff811b868b>] ? SyS_mkdirat+0xab/0xe0
[132020.964554]  [<ffffffff815115cd>] ? system_call_fast_compare_end+0x10/0x15
[132020.964555] XFS (loop0): Corruption detected. Unmount and run xfs_repair
[132020.964564] XFS (loop0): Internal error xfs_trans_cancel at line 959 of file /build/linux-QZaPpC/linux-3.16.7-ckt11/fs/xfs/xfs_trans.c. Caller xfs_create+0x2b2/0x700 [xfs]
[132020.964566] CPU: 2 PID: 21474 Comm: tar Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1
[132020.964567] Hardware name: To be filled by O.E.M. To be filled by O.E.M./SABERTOOTH 990FX R2.0, BIOS 1503 01/11/2013
[132020.964568] 000000000000000c ffffffff8150b405 ffff8800ad1afe98 ffffffffa0925e07
[132020.964570] ffff880002530800 ffff880079e03ec8 ffff880424059800 ffffffffa09577d2
[132020.964571] 0000000000000001 ffff880079e03e20 ffff880079e03e1c ffff880079e03eb0
[132020.964573] Call Trace:
[132020.964575]  [<ffffffff8150b405>] ? dump_stack+0x41/0x51
[132020.964581]  [<ffffffffa0925e07>] ? xfs_trans_cancel+0xc7/0xf0 [xfs]
[132020.964588]  [<ffffffffa09577d2>] ? xfs_create+0x2b2/0x700 [xfs]
[132020.964590]  [<ffffffff811b40ea>] ? kern_path_create+0xaa/0x190
[132020.964596]  [<ffffffffa091c5ea>] ? xfs_generic_create+0xca/0x250 [xfs]
[132020.964598]  [<ffffffff811b7ad0>] ? vfs_mkdir+0xb0/0x160
[132020.964600]  [<ffffffff811b868b>] ? SyS_mkdirat+0xab/0xe0
[132020.964602]  [<ffffffff815115cd>] ? system_call_fast_compare_end+0x10/0x15
[132020.964604] XFS (loop0): xfs_do_force_shutdown(0x8) called from line 960 of file /build/linux-QZaPpC/linux-3.16.7-ckt11/fs/xfs/xfs_trans.c. Return address = 0xffffffffa0925e20
[132021.196487] XFS (loop0): Corruption of in-memory data detected. Shutting down filesystem
[132021.196491] XFS (loop0): Please umount the filesystem and rectify the problem(s)
[132024.791456] XFS (loop0): xfs_log_force: error 5 returned.
[132054.854625] XFS (loop0): xfs_log_force: error 5 returned.
[132084.917775] XFS (loop0): xfs_log_force: error 5 returned.
[132114.980927] XFS (loop0): xfs_log_force: error 5 returned.
[132145.044086] XFS (loop0): xfs_log_force: error 5 returned.
[132175.107307] XFS (loop0): xfs_log_force: error 5 returned.
[132205.170404] XFS (loop0): xfs_log_force: error 5 returned.
[132235.233587] XFS (loop0): xfs_log_force: error 5 returned.
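
The first row of the inode hex dump in the log above can be read by hand. A rough Python sketch, assuming the usual big-endian XFS on-disk inode core layout (magic, mode, version, format, onlink, uid, gid); the field names and the `decode_inode_head` helper are illustrative, and only the sixteen bytes are taken from the log:

```python
import struct

# First 16 bytes of the inode buffer, copied from the dmesg hex dump.
DUMP_ROW = "49 4e 00 00 03 02 00 00 00 30 00 70 00 00 03 e8"


def decode_inode_head(hex_row):
    """Decode the leading fields of an XFS on-disk inode (assumed layout)."""
    raw = bytes.fromhex(hex_row.replace(" ", ""))
    # Assumed layout: be16 magic, be16 mode, u8 version, u8 format,
    # be16 onlink, be32 uid, be32 gid (16 bytes in total).
    magic, mode, version, fmt, onlink, uid, gid = struct.unpack(">2sHBBHII", raw)
    return {
        "magic": magic,
        "mode": mode,
        "version": version,
        "format": fmt,
        "onlink": onlink,
        "uid": uid,
        "gid": gid,
    }


if __name__ == "__main__":
    for field, value in decode_inode_head(DUMP_ROW).items():
        print(field, value)
```

Under that assumed layout the row decodes to magic "IN", mode 0, version 3, format 2 and gid 1000. The gid agrees with the "gid 1000" tar reports; the version byte of 3 next to "Mounting V4 Filesystem" stands out, though whether that is what fails the validation is not established here.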


On 8/2/2015 3:24 PM, Leslie Rhorer wrote:

     OK, this is goofy.  It seems to be working, now.  As usual, I've
been doing some work on the server this weekend, but I can't think of
anything I have done that would fix the issue.  I did replace the
remaining good 4G RAM module with a pair of 8G RAM modules, but memtest
reported the remaining 4G module as good, and I verified the removed
module really was bad.  I also replaced the removable drive carrier and
cables that were feeding the two SSDs, one of which was reporting
failures as noted in the syslog.  It's hard for me to believe either of
those things could have been causing the issue, though.

     I attached a 1.5T external drive to the server and formatted it as
XFS in preparation to continue troubleshooting.  To make sure of things,
I tried decompressing the tarball, again, and this time it worked all
the way to the end.  I then deleted the entire directory structure
created by the tarball and decompressed the file again twice.  I'll see
if the rsync process works.  That will take a couple of days.

On 7/28/2015 5:11 PM, Brian Foster wrote:
On Tue, Jul 28, 2015 at 10:13:01AM -0500, Leslie Rhorer wrote:
On 7/28/2015 7:33 AM, Brian Foster wrote:
On Tue, Jul 28, 2015 at 02:46:45AM -0500, Leslie Rhorer wrote:
On 7/20/2015 6:17 AM, Brian Foster wrote:
On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer wrote:

...

    I then copied both the tarball and the image over to the root, and while
the system would not let me create the image on the root, it did let me copy
the image to the root.  I then umounted the RAID array, mounted the image,
and attempted to cd to the original directory in the image mount where the
tarball was saved.  That failed with an I/O error:


It sounds a bit strange for the mdrestore to fail on root but a cp of
the resulting image to work. Do the resulting images have the same file
size or is the rootfs copy truncated? If the latter, you could be
missing part of the fs and thus any of the following tests are probably
moot.

    Well, it can't be as large as it is reported, let's put it that way,
although the reported file size is the same.  Ls claims it to be 16T in
size, which cannot be the case on a 100G partition.  I forgot to mention cp
does complain:

RAID-Server:/# cp /RAID/TEST/RAIDfile.img ./
cp: cannot lseek ‘./RAIDfile.img’: Invalid argument

    But it does the same thing on the backup server, and it works there.  I
tried a cmp, and it seems to be hung.  It just may be taking a long time,
however.


Yeah, you can't really trust the resulting image. It doesn't take much
space to create a very large sparse file, but different filesystems have
different maximum file size limits. The problem here is that some
metadata near the beginning of the file might reference or depend on
something near the end, and I/Os beyond the end of the file will
probably result in errors.
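
The sparse-file point is easy to see in any scratch directory. A minimal sketch in Python, assuming a filesystem that supports sparse files; the file name and the 64 MiB size are arbitrary, and `make_sparse` and `size_report` are names invented for the example:

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


def size_report(path):
    """Return (apparent bytes, allocated bytes) for a file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        img = os.path.join(tmp, "sparse.img")
        make_sparse(img, 64 * 1024 * 1024)
        apparent, allocated = size_report(img)
        print("apparent:", apparent, "allocated:", allocated)
```

Running the same comparison on the original image and on the rootfs copy would show whether the copy kept the full apparent size; ls -l reports the first number and du the second.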

I'd probably try the nouuid approach since the hardware is similar as
well as some of the other interesting suggestions that have been made to
try and get the image on the rootfs and see what happens there too.

Brian


RAID-Server:/# cd "/media/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver/"
bash: cd: /media/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver/: Input/output error

    I changed directories to a point two directories above the previous attempt
and did a long listing:

RAID-Server:/# cd "/media/Server-Main/Equipment/Drive Controllers/HighPoint Adapters"
RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint Adapters# ll
ls: cannot access RocketRAID 2722: Input/output error
total 4
drwxr-xr-x 6 root lrhorer 4096 Jul 18 19:26 Rocket 2722
?????????? ? ?    ?          ?            ? RocketRAID 2722

    As you can see, Rocket 2722 is still there, but RocketRAID 2722 is very
sick.  Rocket 2722 is the parent of where the tarball was, however, so I did
a cd and an ll again:

RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint Adapters# cd "Rocket 2722"/
RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722# ll
ls: cannot access BIOS: Input/output error
ls: cannot access Driver: Input/output error
ls: cannot access HighPoint RAID Management Software: Input/output error
ls: cannot access Manual: Input/output error
total 248
-rwxr--r-- 1 root lrhorer 245760 Nov 20  2008 autorun.exe
-rwxr--r-- 1 root lrhorer     51 Mar 21  2001 autorun.inf
?????????? ? ?    ?            ?            ? BIOS
?????????? ? ?    ?            ?            ? Driver
?????????? ? ?    ?            ?            ? HighPoint RAID Management Software
?????????? ? ?    ?            ?            ? Manual
-rwxr--r-- 1 root lrhorer   1134 Feb  5  2012 readme.txt

    So now, what?

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs





