Re: XFS disaster recovery

Hi Dave,

Sorry for the imprecise language. The array is around 450 TB raw,
which is why I had been referring to it as roughly half a petabyte,
but after factoring out RAID parity disks and spare disks it should
indeed be around 384 TB formatted.

I checked the metadump over with xfs_db as you suggested, and it
looks like it captured all of the AGs that the filesystem had.

# ./xfs_db -r /exports/home/work/md4.img
Metadata CRC error detected at 0x555a5c9dc8fe, xfs_agf block 0x4d7fffd948/0x1000
xfs_db: cannot init perag data (74). Continuing anyway.
xfs_db> sb 0
xfs_db> p agcount
agcount = 350
xfs_db> agf 349
xfs_db> p
magicnum = 0x58414746
versionnum = 1
seqno = 349
length = 82676200
bnoroot = 11
cntroot = 9
rmaproot =
refcntroot =
bnolevel = 2
cntlevel = 2
rmaplevel = 0
refcntlevel = 0
rmapblocks = 0
refcntblocks = 0
flfirst = 576
fllast = 581
flcount = 6
freeblks = 12229503
longest = 55037
btreeblks = 545
uuid = 4f39a900-91fa-4c5d-ba34-b56e77720db3
lsn = 0x1bb700239100
crc = 0xd329ccfc (correct)
xfs_db>

Looking at the image, the apparent size matches the actual size of
the filesystem, and the amount of space the sparse file actually
occupies on disk looks sane:

# ls -l /exports/home/work/
total 157159048
-rw-r--r-- 1 root root 384068188372992 Feb  7 22:02 md4.img
-rw-r--r-- 1 root root     53912722432 Feb  7 21:59 md4.metadump
-rw-r--r-- 1 root root     53912722432 Feb  7 16:50 md4.metadump.save
# du -sh /exports/home/work/md4.img
50G     /exports/home/work/md4.img
#
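
As a quick cross-check, GNU stat can show the same comparison in one
line, apparent size versus blocks actually allocated:

# stat -c 'apparent %s bytes, allocated %b blocks of %B bytes' /exports/home/work/md4.img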

I also have to apologize: in my last email I accidentally ran the
copy of xfs_repair installed from the Ubuntu package manager (an old
4.9.0) instead of the copy that I built from the dev tree.
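
To avoid repeating that mistake I'm now invoking the repair binary by
explicit path (the same directory as the xfs_db above) and checking
the version string first before each run:

# ./xfs_repair -V

rather than relying on whatever happens to be first in $PATH.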

I took advantage of this test environment to just run a bunch of
experiments and see what happened.

I found that if I ran the dev tree xfs_repair with the -P option, it
would complete a run. It exits with return code 130, but the
resulting loopback image filesystem is mountable, and I see around
27 TB in lost+found, which would represent roughly a 9% loss relative
to what was actually on the filesystem.
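
For reference, the sequence against the restored image looked roughly
like this (the loop device and mount point names are illustrative):

# losetup --find --show /exports/home/work/md4.img
/dev/loop0
# ./xfs_repair -P /dev/loop0 ; echo "exit code: $?"
[...]
exit code: 130
# mount -o ro /dev/loop0 /mnt/md4-test
# du -sh /mnt/md4-test/lost+found
27T     /mnt/md4-test/lost+found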

Given where we started, I think this is acceptable (more than
acceptable, honestly; I was getting to the point of expecting to have
to write off the majority of the filesystem), and it seems like a
workable way forward to get most of the data off this old filesystem.

Is there anything further I should check, or any caveats I should
bear in mind when applying this xfs_repair to the real filesystem? Or
does it seem reasonable to go ahead, repair it, and start copying
data off?
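
Unless you advise otherwise, my tentative plan for the real array is
a no-modify pass first, then the actual repair (with -P again, since
that is what got the test run to complete) with the output captured,
then a read-only mount to start copying data off. Roughly, with the
device path below as a placeholder:

# ./xfs_repair -n /dev/md4 2>&1 | tee /root/md4-repair-dryrun.log
# ./xfs_repair -P /dev/md4 2>&1 | tee /root/md4-repair.log
# mount -o ro /dev/md4 /mnt/md4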

Thanks so much for all your help so far,

Sean

On Mon, Feb 7, 2022 at 8:51 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 07, 2022 at 05:56:21PM -0500, Sean Caron wrote:
> > Got it. I ran an xfs_repair on the simulated metadata filesystem and
> > it seems like it almost finished but errored out with the message:
> >
> > fatal error -- name create failed in lost+found (28), filesystem may
> > be out of space
>
> Not a lot to go on there - can you send me the entire repair output?
>
> > However there is plenty of space on the underlying volume where the
> > metadata dump and sparse image are kept. Even if the sparse image was
> > actually 384 TB as it shows up in "ls", there's 425 TB free on the
> > volume where it's kept.
>
> Hmmm - the sparse image should be the same size as the filesystem
> itself. If it's only 384TB and not 500TB, then either the metadump
> or the restore may not have completed fully.
>
> > I wonder since this was a fairly large filesystem (~500 TB) it's
> > hitting some kind of limit somewhere with the loopback device?
>
> Shouldn't - I've used larger loopback files hosted on XFS
> filesystems in the past.
>
> > Any thoughts on how I might be able to move past this? I guess I will
> > need to xfs_repair this filesystem one way or the other anyway to get
> > anything off of it, but it would be nice to run the simulation first
> > just to see what to expect.
>
> I think that first we need to make sure that the metadump and
> restore process was completed successfully (did you check the exit
> value was zero?). xfs_db can be used to do that:
>
> # xfs_db -r <image-file>
> xfs_db> sb 0
> xfs_db> p agcount
> <val>
> xfs_db> agf <val - 1>
> xfs_db> p
> .....
> (should dump the last AGF in the filesystem)
>
> If that works, then the metadump/restore should have been complete,
> and the size of the image file should match the size of the
> filesystem that was dumped...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx


