On 7/20/2015 6:17 AM, Brian Foster wrote:
> On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer wrote:
>> I found the problem with md5sum (and probably nfs, as well). One of the
>> memory modules in the server was bad. The problem with XFS persists. Every
>> time tar tried to create the directory:
>> /RAID/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver/RR276x/Driver/Linux/openSUSE/rr276x-suse-11.2-i386/linux/suse/i386-11.1
>> It would begin spitting out errors, starting with "Cannot mkdir: Structure
>> needs cleaning". At that point, XFS had shut down. I went into
>> /RAID/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket
>> 2722/Driver/RR276x/Driver/Linux/openSUSE/rr276x-suse-11.2-i386/linux/suse/
>> and created the i386-11.1 directory by hand, and tar no longer starts
>> spitting out errors at that point, but it does start up again at
>> RR2782/Windows/Vista-Win2008-Win7-legacy_single/x64.
> So is this untar problem a reliable reproducer? If so, here's what I
The processes I was running this weekend ran longer than expected, and
in fact were still running just a couple of hours ago. I was doing an
rsync with CRC check from the backup system to the one with the problem.
There were a few corrupt files, but not a huge number. Although
slower than I hoped, everything was running fine until a short time ago,
when rsync encountered the very same issue I keep having with tar, which
is to say it tried to create a directory and the file system crashed
with precisely the same symptoms as when tar was failing.
> would try to hopefully isolate a filesystem problem from something
> underneath:
> xfs_metadump -go /dev/md0 /somewhere/on/rootfs/md0.metadump
> xfs_mdrestore -g /somewhere/on/rootfs/md0.metadump /.../fileonrootfs.img
> mount /.../fileonrootfs.img /mnt/
I tried to do the xfs_mdrestore to the root file system, but it fails:
RAID-Server:/TEST# xfs_mdrestore -g md0.metadump RAIDfile.img
xfs_mdrestore: cannot set filesystem image size: File too large
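A "File too large" error at this step is consistent with the target filesystem refusing a single file as large as the array, since the restored image is sized to the whole filesystem even though most of it is sparse. That reading is an assumption, not something confirmed here. A minimal sketch of how one might probe a candidate location before restoring, where can_hold_sparse is a hypothetical helper and not part of xfsprogs:

```shell
#!/bin/sh
# Sketch: check whether a directory's filesystem accepts a sparse file of a
# given size, which is what a restored image of that size would need.
# can_hold_sparse is a hypothetical helper name, not an xfsprogs tool.
can_hold_sparse() {
    dir=$1
    size=$2    # anything "truncate -s" accepts, e.g. 1M, 20T, or a byte count
    probe="$dir/.sparse-probe.$$"
    if truncate -s "$size" "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    rm -f "$probe"
    return 1
}

# The array's size in bytes could be read (as root) with:
#   blockdev --getsize64 /dev/md0
# A small size is used below so the sketch is harmless to run.
if can_hold_sparse "${TMPDIR:-/tmp}" 1M; then
    echo "sparse file of that size fits"
else
    echo "sparse file of that size rejected"
fi
```

The probe file is sparse and is removed immediately, so it should not consume real space on the target.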
So then I did the same thing to a directory on an nfs mount from
another machine. That worked. I then went to the other machine,
mounted the image on /media, copied the tarball to the location on the
mount where the tarball resides on the real array, and ran the tar job.
It completed without errors.
I then created the image on the array where the tasks are failing and
attempted to mount it to /media on the problematic machine. That fails
with:
RAID-Server:/TEST# mount /RAID/TEST/RAIDfile.img /media/
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
The problem is this (from syslog):
Jul 28 01:53:48 RAID-Server kernel: [431155.847523] loop: module loaded
Jul 28 01:53:48 RAID-Server kernel: [431155.927238] XFS (loop0): Filesystem has duplicate UUID 228cfaa7-ae6b-44fc-b703-1c32385231c0 - can't mount
Jul 28 01:55:51 RAID-Server kernel: [431278.916490] XFS (loop0): Filesystem has duplicate UUID 228cfaa7-ae6b-44fc-b703-1c32385231c0 - can't mount
Presumably it has the same UUID as the RAID array because it is
expected to do so. I can't mount it unless I umount the RAID array, but
if I do that, I can't get to the file to mount the dump image, since it
is on the array.
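One possible way around the duplicate-UUID refusal, which was not tried above: XFS has a nouuid mount option that skips the duplicate-UUID check, so the image could in principle be mounted while the array stays mounted. The sketch below only prints the command rather than running it; build_mount_cmd is a hypothetical helper, and the paths are the ones from the failed mount attempt.

```shell
#!/bin/sh
# Sketch only: print (do not run) a loop mount command that asks XFS to skip
# its duplicate-UUID check via the "nouuid" mount option.
# build_mount_cmd is a hypothetical helper name used for illustration.
build_mount_cmd() {
    # $1 = image file, $2 = mount point
    printf 'mount -t xfs -o loop,nouuid "%s" "%s"\n' "$1" "$2"
}

# Paths taken from the failed mount attempt above.
build_mount_cmd /RAID/TEST/RAIDfile.img /media
```

Running the printed command requires root. The alternative of giving the image a new UUID would modify the image itself, which may not be desirable for a reproducer.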
I then copied both the tarball and the image over to the root, and
while the system would not let me create the image on the root, it did
let me copy the image to the root. I then umounted the RAID array,
mounted the image, and attempted to cd to the original directory in the
image mount where the tarball was saved. That failed with an I/O error:
RAID-Server:/# cd "/media/Server-Main/Equipment/Drive
Controllers/HighPoint Adapters/Rocket 2722/Driver/"
bash: cd: /media/Server-Main/Equipment/Drive Controllers/HighPoint
Adapters/Rocket 2722/Driver/: Input/output error
I changed directories to a point two directories above the previous
attempt and did a long listing:
RAID-Server:/# cd "/media/Server-Main/Equipment/Drive
Controllers/HighPoint Adapters"
RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint
Adapters# ll
ls: cannot access RocketRAID 2722: Input/output error
total 4
drwxr-xr-x 6 root lrhorer 4096 Jul 18 19:26 Rocket 2722
?????????? ? ? ? ? ? RocketRAID 2722
As you can see, Rocket 2722 is still there, but RocketRAID 2722 is very
sick. Rocket 2722 is the parent of where the tarball was, however, so I
did a cd and an ll again:
RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint
Adapters# cd "Rocket 2722"/
RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint
Adapters/Rocket 2722# ll
ls: cannot access BIOS: Input/output error
ls: cannot access Driver: Input/output error
ls: cannot access HighPoint RAID Management Software: Input/output error
ls: cannot access Manual: Input/output error
total 248
-rwxr--r-- 1 root lrhorer 245760 Nov 20 2008 autorun.exe
-rwxr--r-- 1 root lrhorer 51 Mar 21 2001 autorun.inf
?????????? ? ? ? ? ? BIOS
?????????? ? ? ? ? ? Driver
?????????? ? ? ? ? ? HighPoint RAID Management Software
?????????? ? ? ? ? ? Manual
-rwxr--r-- 1 root lrhorer 1134 Feb 5 2012 readme.txt
So now, what?