Re: Weird xfs_repair error

On Mon, Jul 24, 2017 at 04:27:28PM +0200, Emmanuel Florac wrote:
> On Mon, 17 Jul 2017 13:11:29 -0400
> Brian Foster <bfoster@xxxxxxxxxx> wrote:
> 
> > On Tue, Jul 11, 2017 at 03:23:52PM +0200, Emmanuel Florac wrote:
> > > On Fri, 7 Jul 2017 08:36:33 -0700
> > > "Darrick J. Wong" <darrick.wong@xxxxxxxxxx> wrote:
> > >   
> > > > > fatal error -- name create failed in lost+found (28), filesystem
> > > > > may be out of space    
> > > > 
> > > > Would be helpful to have a metadump of this goobered-up lost+found
> > > > fs...
> > > >   
> > > 
> > > The metadump is here for anyone who would like to have a look:
> > > 
> > > http://update2.intellique.com/pub/bign.metadump.xz
> > > 
> > > The filesystem is about 115 TiB.
> > >   
> > 
> > Thanks for posting this. The first thing to note is that this
> > filesystem is severely corrupted.
> 
> I had worked that out myself: many runs of xfs_repair (across several
> versions: v4.7, 4.9, 4.11...) can't get it into a stable state, i.e.
> one that won't crash when the filesystem is accessed.
> 
> > Nonetheless, I've been playing
> > around with trying to get the latest for-next xfs_repair to run
> > through this fs (via gdb) and have definitely hit a few issues:
> > 
> > - xfs_sb_verify() was changed to use bp->b_maps[0].bm_bn rather than
> >   bp->b_bn in libxfs commit 85428dd23f ("xfs: fix superblock
> >   inprogress check"). b_maps isn't allocated if the buffer was
> >   initialized with libxfs_initbuf() (rather than
> >   libxfs_initbuf_map()). This causes a sigsegv here, though only if
> >   I disable -O2 optimization for some reason that I haven't dug into
> >   yet.
> > - libxfs commit 0268fdc3fe ("xfs: remove xfs_trans_get_block_res")
> >   replaced the use of xfs_trans_get_block_res() in
> >   xfs_bmbt_alloc_block() which causes the -ENOSPC error. The previous
> >   function was hardcoded to return 1 such that this would never occur.
> > - The recently added directory sf format verifier (xfs_iformat_fork()
> >   -> xfs_dir2_sf_verify()) seems to cause a premature repair failure
> >   in at least one case.
> > 
> > I was able to eventually get repair to complete with some quick hacks
> > to bypass those issues. I did have to run repair two or three times
> > to get the fs to a clean state. The fs mounts and otherwise appears
> > clean to xfs_repair, but it's not clear to me how usable the
> > resulting fs really is (repair is for fs consistency after all, not
> > necessarily data recovery). Note that lost+found appears to be loaded
> > with 18T of data across almost 2 million inodes. :/
> 
> Thank you for your efforts. The loaded lost+found matches my own
> results; however, some of the files there may have been present for
> years. In fact, this filesystem has crashed several times over the past
> few years but always came back online at some point, until... now.
> 
> So what could I do, at least to be able to mount it and copy everything
> elsewhere before mkfs'ing it all again? Do you have an xfs_repair
> binary at hand that I could use, or should I dig into the latest
> source?
> 

There are several fixes in-flight for the issues uncovered by this
metadump. I think you'll want to include the following 3 patches to
xfsprogs:

http://marc.info/?l=linux-xfs&m=150047977108174&w=2
http://marc.info/?l=linux-xfs&m=150040481220074&w=2
http://marc.info/?l=linux-xfs&m=150040481820076&w=2

Note that the last 2 patches are probably going to be reworked into a
different implementation. The idea here is ultimately to avoid running
the verifier in a case where it disrupts xfs_repair, so using this
intermediate patch series should be good enough to build a custom binary
that allows xfs_repair to eventually piece the fs back together. You
could alternatively just hack xfs_dir2_sf_verify() to return 0.

Note that I would highly recommend testing whatever you build against
your metadump before running it on the original fs.
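
A rough sketch of such a rehearsal follows. It restores the metadump
to a sparse image file, runs the custom-built xfs_repair on the image,
then re-checks the result in no-modify mode. The file names, the path
to the custom binary, and the helper names (run, rehearse_repair,
DRY_RUN) are assumptions made for illustration and do not come from
this thread. The script only prints the commands unless DRY_RUN=0 is
set.

```shell
#!/bin/sh
# Sketch: rehearse a custom-built xfs_repair on a restored metadump image
# before touching the real filesystem.
# Assumed, not from the thread: file names, the location of the custom
# binary, and the DRY_RUN switch.
set -eu

METADUMP_XZ=${METADUMP_XZ:-bign.metadump.xz}    # compressed metadump
IMAGE=${IMAGE:-bign.img}                        # sparse image to restore into
REPAIR=${REPAIR:-./xfsprogs/repair/xfs_repair}  # hypothetical build path
DRY_RUN=${DRY_RUN:-1}                           # 1 = print commands only

# Echo a command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != 1 ]; then "$@"; fi
}

rehearse_repair() {
    # Decompress, keeping the original .xz file.
    run xz -dk "$METADUMP_XZ"
    # Rebuild a filesystem image (metadata only) from the metadump.
    run xfs_mdrestore "${METADUMP_XZ%.xz}" "$IMAGE"
    # Repair the image; -f says the target is a regular file.
    run "$REPAIR" -f "$IMAGE"
    # Re-check without modifying anything; non-zero means still corrupt.
    run "$REPAIR" -n -f "$IMAGE"
}

rehearse_repair
```

Since repair may need more than one pass on a filesystem this damaged,
the repair step can be repeated by hand until the final -n check exits
with status 0.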

Brian

> -- 
> ------------------------------------------------------------------------
> Emmanuel Florac     |   Direction technique
>                     |   Intellique
>                     |	<eflorac@xxxxxxxxxxxxxx>
>                     |   +33 1 78 94 84 02
> ------------------------------------------------------------------------




