On Fri, May 29, 2015 at 03:03:57PM +0100, Mike Grant wrote:
> We recently had a 180TB XFS filesystem go down after following some
> ill-considered advice from a Dell tech (re-onlining a maybe-failed disk,
> which one might think was ok..). It's not irreplaceable data, but
> xfs_repair segfaults when trying to fix up and I thought that might be
> of interest here to help fix the segfault. We're not expecting to
> recover the data, though it would be nice.
>
> Partial logs & backtraces of xfs_repair runs using the latest Centos-7
> xfsprogs package and also run with the xfs_repair built from the git
> master, copies of core dumps and a metadump are at:
> https://rsg.pml.ac.uk/shared_files/mggr/xfs_segfault

Given it is choking on directory corruption repair, I'd strongly
recommend trying the current git version (3.2.3-rc1) here:

git://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git

> Maximum memory use was only about 1GB by the time of the crash, and
> there was 120GB+ of swap available, so I don't think that was an issue.
> The command was "xfs_repair -v /dev/md0 -t 60 -P".
>
> Run time is about 2 hours to a crash and we'll probably want to wipe and

Probably because you turned off prefetch, which makes it *slow*. :P

I'd build the new xfsprogs, restore the metadump to a file on a
different machine, and then run the new xfs_repair binary on the
restored metadump image. That will tell you pretty quickly if the
problem is solved. If it is solved, then you can run the new
xfs_repair on the real server.

Just remember, though, that even once the FS has been repaired,
you'll still have to search for data corruption manually and deal
with that...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
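
For reference, the test-on-a-copy workflow described above (build the new xfsprogs, restore the metadump to a file, run the new xfs_repair on that file) could be sketched roughly as below. This is only an illustration under assumptions, not a tested recipe: the file names `md0.metadump` and `md0.img` and the checkout directory are hypothetical placeholders, a plain `make` in the source tree is assumed to be enough to build it, and the built binary is assumed to end up at `repair/xfs_repair`. The script only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Dry-run by default: commands are only printed unless RUN=1 is set.
# All paths below are hypothetical placeholders, not taken from the thread.
METADUMP="${METADUMP:-md0.metadump}"   # metadump copied off the server
IMAGE="${IMAGE:-md0.img}"              # scratch image file to restore into
SRC="${SRC:-xfsprogs-dev}"             # checkout directory for the git tree
REPAIR="$SRC/repair/xfs_repair"        # assumed location of the built binary

# Print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

repair_test_plan() {
    # 1. Fetch and build the current xfsprogs (assumes build deps are installed).
    run git clone git://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git "$SRC" || return 1
    run make -C "$SRC" || return 1
    # 2. Restore the metadump into a regular file on the test machine.
    run xfs_mdrestore "$METADUMP" "$IMAGE" || return 1
    # 3. Run the freshly built xfs_repair on the image (-f: target is a file).
    #    -P is left out here so that prefetch stays enabled.
    run "$REPAIR" -f -v "$IMAGE" || return 1
}

repair_test_plan
```

Only if the repair completes on the restored image would the same binary then be pointed at the real device.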