Re: XFS File system in trouble

On Wed, Jul 22, 2015 at 08:45:07PM -0500, Leslie Rhorer wrote:
> On 7/20/2015 6:17 AM, Brian Foster wrote:
> >On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer wrote:
> >>
> >>	I found the problem with md5sum (and probably nfs, as well).  One of the
> >>memory modules in the server was bad.  The problem with XFS persists.  Every
> >>time tar tried to create the directory:
> >>
> >>/RAID/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket 2722/Driver/RR276x/Driver/Linux/openSUSE/rr276x-suse-11.2-i386/linux/suse/i386-11.1
> >>
> >>	It would begin spitting out errors, starting with "Cannot mkdir: Structure
> >>needs cleaning".  At that point, XFS had shut down.  I went into
> >>/RAID/Server-Main/Equipment/Drive Controllers/HighPoint Adapters/Rocket
> >>2722/Driver/RR276x/Driver/Linux/openSUSE/rr276x-suse-11.2-i386/linux/suse/
> >>and created the i386-11.1 directory by hand, and tar no longer starts
> >>spitting out errors at that point, but it does start up again at
> >>RR2782/Windows/Vista-Win2008-Win7-legacy_single/x64.
> >>
> >
> >So is this untar problem a reliable reproducer? If so, here's what I
> 
> 	Absolutely reliable reproducer.  The only change is if I create the offending
> directory by hand (after recovering the filesystem, of course) and then
> start the tar again.  Then it copies all the files into the previously
> offending directory, failing the next time it tries to create a directory.
> 
> >would try to hopefully isolate a filesystem problem from something
> >underneath:
> 
> 	OK.  Frankly, I fail to find it at all likely to be anything above.  I can
> read and write 100s of megabytes of data without an error.  The only thing
> that I can find failing is creating directories, and that is only when tar
> attempts it.  The directory structure is going to be written to different
> inodes as time goes by, so a failure of mdadm or some structure above it
> should cause other widespread issues.  I need to try some other tarballs
> when I get the chance, and also try dumping that tar on a different
> directory.
> 

I wouldn't disagree, but I'd still run the test. ;)

> 
> >xfs_metadump -go /dev/md0 /somewhere/on/rootfs/md0.metadump
> >xfs_mdrestore -g /somewhere/on/rootfs/md0.metadump /.../fileonrootfs.img
> 
> 	How big are those files going to be, do you think?  The root partition is
> not all that huge.  There is only a little over 80G free.
> 

I'm not really sure. It depends on how much metadata is on the fs. FWIW,
the image should be compressible if you want to transfer it to another
server with more room to play with.
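
A minimal sketch of that compression step, assuming gzip is available. The
helper name compress_dump is hypothetical, and gzip is only one possible
choice of compressor; nothing in this thread prescribes it. The helper keeps
the original file and prints the size before and after, in bytes.

```shell
#!/bin/sh
# compress_dump FILE: write FILE.gz next to FILE (the original is kept)
# and print "original_bytes compressed_bytes" so the saving is visible.
# Hypothetical helper; gzip is an assumed choice of compressor.
compress_dump() {
    src=$1
    gzip -c -- "$src" > "$src.gz" || return 1
    printf '%s %s\n' "$(wc -c < "$src")" "$(wc -c < "$src.gz")"
}
```

For example, compress_dump /somewhere/on/rootfs/md0.metadump would leave
md0.metadump.gz beside the dump, ready to copy to the other server.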

> >mount /.../fileonrootfs.img /mnt/
> >
> >... and repeat the test on that mount using the original tarball (if
> >it's on the associated fs, the version from the dump will have no data).
> 
> 	It is.  I've tried copying it to another fs, and it works fine, there.
> 
> >This will create a metadata only dump of the original fs onto another
> >storage device (e.g., whatever holds the root fs), restore the metadump
> >to a file and mount it loopback. The resulting fs will not contain any
> >file data, but will contain all of the metadata such as directory
> >structure, etc. and is otherwise mountable and usable for experimental
> >purposes.
> >
> >If the problem is in the filesystem or "above" (as in kernel, memory
> >issue, etc.), the test should fail on this mount. If the problem is
> >beneath the fs such as somewhere in the storage stack (assuming the
> >rootfs storage stack is reliable), it probably shouldn't fail.
> 
> 	I'll look into this when I can.  Right now I have some critical operations
> going on both servers (primary and backup), and I can't take down a file
> system or even risk doing so.  Hopefully I will get around to it this
> weekend.
> 

A positive side effect, if the problem reproduces with the metadump, is
that you can potentially share a reproducer with the developers here. Note
that the commands above create an unobfuscated image, which therefore
contains the original metadata (filenames, etc.) of the fs. You could
also create an obfuscated image (without the -o option) to scramble
filenames and whatnot, but I'd suggest verifying independently that the
tarball test still reproduces on that image.
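
The steps discussed in this thread can be collected into one script, shown
here as a sketch only. It prints the commands by default and executes them
only when RUN=1 is set (as root). The DUMP, IMG, TARBALL and DEST defaults
are hypothetical placeholders, since the thread elides the real paths, and
the helper names run and metadump_test are likewise made up for
illustration. Setting OBFUSCATE=1 drops -o to produce the scrambled variant.

```shell
#!/bin/sh
# Sketch of the metadump test from this thread. Dry run by default:
# set RUN=1 and run as root to actually execute the commands.
# xfs_metadump is meant for an unmounted, read-only or frozen filesystem.
# Every path below is a hypothetical placeholder; adjust before use.
DEV=${DEV:-/dev/md0}                   # array named in the thread
DUMP=${DUMP:-/var/tmp/md0.metadump}    # must live off the XFS array
IMG=${IMG:-/var/tmp/md0.img}           # restored metadata-only image
MNT=${MNT:-/mnt}
TARBALL=${TARBALL:-/var/tmp/test.tar}  # copy kept outside the array
DEST=${DEST:-$MNT}                     # directory to extract into

# Print the command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

metadump_test() {
    # -g shows progress; -o disables obfuscation (real names are kept).
    if [ "${OBFUSCATE:-0}" = 1 ]; then
        run xfs_metadump -g "$DEV" "$DUMP" || return 1
    else
        run xfs_metadump -g -o "$DEV" "$DUMP" || return 1
    fi
    run xfs_mdrestore -g "$DUMP" "$IMG" || return 1
    # "-o loop" is explicit; a plain mount of a regular file usually
    # sets up the loop device by itself.
    run mount -o loop "$IMG" "$MNT" || return 1
    # Repeat the failing extraction on the metadata-only copy.
    run tar -xf "$TARBALL" -C "$DEST" || return 1
    run umount "$MNT"
}

metadump_test
```

Running it once with OBFUSCATE=0 and once with OBFUSCATE=1 (using a
different DUMP and IMG for the second pass) gives both images to test
against before deciding which one to share.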

Brian

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs


