Re: XFS File system in trouble


 



On 7/20/2015 3:05 AM, Martin Papik wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512


Since you've already found one HW related fault, would you consider
booting into memtest for a couple of passes just to be on the safe
side?

I did that after confirming the one stick of memory was bad. Twice. I got over 20,000 errors on the bad stick, and 0 on the good one. I also swapped the locations on the motherboard, and the bad stick still failed while the good one passed 100%.

And did you by any chance look at SMART, if applicable, and
possibly run a test on the drives?

Yes. SMART found no errors, but think about it. Every time tar untars that file in that location, the file system croaks when tar tries to create a directory. Not when reading, and not when writing, other than when it creates a directory. When I create the directory manually, the process stops failing at that point and fails later on during a different directory create. The array remains intact when reading, and dmesg shows no drive errors. I've re-synced the array several times, which reads every byte on all 8 drives, without a single mismatch. To my knowledge, no read has ever failed except after the filesystem goes offline. I thought reads were failing during the CRC checks, but that was a red herring.
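
One way to narrow down a failure like this is to replay only the directory creates from the archive, one at a time, and see where the first one fails. The following is a rough sketch, not what was actually run here: the function name replay_tar_dirs and its two arguments (archive path, target directory) are made up for illustration, and it assumes tar lists directory members with a trailing slash, as GNU tar does.

```shell
#!/bin/sh
# Create each directory named in a tar archive under a target directory,
# one at a time, stopping at the first mkdir that fails.
# Hypothetical helper: $1 = archive path, $2 = existing target directory.
replay_tar_dirs() {
    archive=$1
    target=$2
    # Directory members are listed with a trailing "/".
    tar -tf "$archive" | grep '/$' | (
        while IFS= read -r dir; do
            if ! mkdir -p "$target/$dir"; then
                echo "mkdir failed at: $dir" >&2
                exit 1    # leaves only this subshell
            fi
            echo "$dir"   # report each directory that was created
        done
    )
}
```

If the filesystem only falls over on directory creation, the last name printed before the failure points at the parent directory worth inspecting.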

Another test I sometimes do
when I'm unsure about disks is "cat /dev/sda > /dev/null" (i.e. a
whole disk read test)

echo repair > /sys/block/md0/md/sync_action reads not one drive, but every byte on all 8 drives.
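
For anyone following along, a minimal sketch of checking the result of such a pass. The sync_action and mismatch_cnt files are part of the md sysfs interface; the function name md_scrub_report and its optional directory argument are made up here so the logic can be tried against a scratch directory rather than a live array.

```shell
#!/bin/sh
# Report the scrub state and mismatch count of an md array.
# $1 is the array's md sysfs directory (default: /sys/block/md0/md).
md_scrub_report() {
    md_dir="${1:-/sys/block/md0/md}"
    action=$(cat "$md_dir/sync_action") || return 1
    mismatches=$(cat "$md_dir/mismatch_cnt") || return 1
    echo "sync_action=$action mismatch_cnt=$mismatches"
    # Exit status is non-zero if the last pass counted any mismatches.
    [ "$mismatches" -eq 0 ]
}

# Typical use on a live array (as root):
#   echo check > /sys/block/md0/md/sync_action   # read-only scrub
#   md_scrub_report                              # once sync_action is "idle" again
```

Writing "check" only reads and counts mismatches, while "repair" also rewrites them, so "check" is the gentler choice when the goal is just to confirm the drives read cleanly.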

and see (dmesg) if any errors show up, unless

	'Nary one, and no mismatches.

you're willing to run badblocks in a read-write nondestructive mode.
In my experience the read test or badblocks can be run simultaneously
with smartctl -t long. But as a start I'd look at smartctl --all
/dev/sd? and see if there are any bad signs. I hope this helps. Good luck
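
The checks suggested above could be strung together roughly as follows. This is only a sketch: the function name disk_health_pass and the RUN variable are invented for illustration, and it deliberately uses the default read-only badblocks scan rather than the read-write nondestructive mode (-n), which should not be pointed at a member of an active array.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) a basic health pass over the given disks.
# Hypothetical helper; device paths are assumed to contain no spaces.
disk_health_pass() {
    for dev in "$@"; do
        # smartctl -t long only starts the self-test and returns, so the
        # badblocks read scan that follows runs while the test is in progress.
        for cmd in "smartctl --all $dev" "smartctl -t long $dev" "badblocks -sv $dev"; do
            if [ "${RUN:-0}" = 1 ]; then
                $cmd          # intentionally unquoted: split into command and arguments
            else
                echo "$cmd"   # dry run: show what would be executed
            fi
        done
    done
}

# Dry run for two drives:
#   disk_health_pass /dev/sda /dev/sdb
```

Leaving RUN unset prints the commands for review first, which seems prudent before starting hours-long scans on every drive in an array.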


On 07/20/2015 10:41 AM, Leslie Rhorer wrote:
On 7/19/2015 6:27 PM, Dave Chinner wrote:
On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer wrote:

I found the problem with md5sum (and probably nfs, as well).
One of the memory modules in the server was bad.  The problem
with XFS persists.  Every time tar tried to create the
directory:

Now you need to run xfs_repair.

I do that every time the array implodes.  It makes no difference.
It never mentions cleaning the structure tar says needs cleaning,
and the next time I run tar on that file, the filesystem craters.

_______________________________________________ xfs mailing list
xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBCgAGBQJVrKuzAAoJELsEaSRwbVYrdjoP/3n1W9YtcpdiDoylp6tDYcjF
vEVz7IWLv2cOky8Lp+0WAZ4Z0WMhcutFzT571H1Vc+jT/UgO25pQHa3yLYTboPuZ
+tBidVUycs7ZIr9QCZFs2uPQ/7YstamB+F7paCTMKtOJJr5CZLiYX4iyJ9sFmWVY
UFPAIhyoqD5CFgoaAkwCmk50kNiT0aPM7egizIUVEt14cWuxZxMN0NIJ5b0WJfAk
qtNQjstVI/xYDgsImm2ZAm19SfOG9ltm2G9zafRr6lR6rRtXjtZX8zEg0l/o9XUw
OifghjoSup8OCzvX6+4+Soj/3mCKZv4rkBm3exf4YzfQ9eVG6Ktele2rLIs1sl3O
hUrZUNEl8hYGJeb5gBHFV/TLWDMMwNde/6JiBVy0V8EbDF1lvR4jYpUwThOE0jyL
ZbzZe4N/B0qvB1OpLDkHrMVm9NPtDkfXdTtM2kRmo5955xtkK09yHF/v64kz7IKc
2rM5pOwTR6HWE8RF2j9UujgPjw6nEUuY01TvIMGYzMfkJTI+sVjeDQfwnPG8tzIa
x4uLa4vTrBD5IaICjAmQiY69qqmt5Vg42G4latZVTYQLelvWQ774mXZfgfT/GtbT
RKzVwvYowWr/EBhtp7ix/1rWANTFiX0lxOPnRmUFvu8UJnyZhR0/EYbJYy1+jTt7
O7hZMfAayQBsnVcSK1JC
=3Ubd
-----END PGP SIGNATURE-----




