Re: Weird xfs_repair error

On Thu, Jul 06, 2017 at 03:30:20PM +0200, Emmanuel Florac wrote:
> 
> After a RAID controller went bananas, I encountered an XFS corruption
> on a filesystem. Weirdly, the corruption seems to be mostly located in
> lost+found.
> 
> (I'm currently working on a metadump'd image of course, not the real
> thing; there are 90TB of data to be hopefully salvaged in there).
> 
> "ls /mnt/rescue/lost+found" gave this:
> 
> XFS (loop0): metadata I/O error: block 0x22b03f490
> ("xfs_trans_read_buf_map") error 117 numblks 16 
> XFS (loop0): xfs_imap_to_bp: xfs_trans_read_buf() returned error 117.
> XFS (loop0): Corruption detected. Unmount and run xfs_repair 
> XFS (loop0): Corruption detected. Unmount and run xfs_repair
> 
> I've run xfs_repair 4.9 on the xfs_mdrestored image. It dumps an insane
> number of errors (the output log is 65MB) and ends with this very
> strange message:
> 
> disconnected inode 26417467, moving to lost+found
> disconnected inode 26417468, moving to lost+found
> disconnected inode 26417469, moving to lost+found
> disconnected inode 26417470, moving to lost+found
> 
> fatal error -- name create failed in lost+found (117), filesystem may
> be out of space
> 
> Even stranger, after mounting the image again, there is no lost+found
> anywhere to be found! However, the filesystem has lots of free space
> and free inodes, so how come?
> 

Did you originally run xfs_repair with the -n option? I'd guess not,
since it ultimately failed while making a modification, but if you did,
something to be aware of is that -n skips the warning about a dirty log
and can potentially report much more corruption than it would after a
log recovery. It might be worth running it again after an attempted log
recovery.
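
For example (just a sketch; the image path below is a placeholder), log
recovery is simply a mount/unmount cycle of the restored image before
the read-only check, assuming the log actually replays cleanly:

  mount -o loop /path/to/fs.img /mnt/tmp   # mounting replays the dirty log
  umount /mnt/tmp
  xfs_repair -n -f /path/to/fs.img         # then the read-only check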

Otherwise, I'd be curious about the state of the fs after the above
error. Does 'xfs_repair -n' continue to report errors?
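
Since the failed repair has already modified that image, one option
(again just a sketch, assuming you still have the original metadump
file around) is to restore a fresh copy and run the read-only check
against both images:

  xfs_mdrestore /path/to/fs.metadump /path/to/fresh.img
  xfs_repair -n -f /path/to/fresh.img    2>&1 | tee repair-n-before.log
  xfs_repair -n -f /path/to/repaired.img 2>&1 | tee repair-n-after.log

That way you can compare what -n reports before and after the failed
repair attempt.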

Also, the above suggests that lost+found existed (in a corrupted state)
prior to the initial repair attempt, yes? If so, it might be interesting
to identify the inode number of lost+found and follow what xfs_repair
does with that inode during the initial run (e.g., whether it attempts
to use the corrupted lost+found before that inode has been fixed up, or
something of that nature).
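
For example (sketch only; paths are placeholders and <ino> is whatever
number you find, assuming the directory inode is still readable enough
to stat), on a fresh, unrepaired copy of the image:

  mount -o loop /path/to/fresh.img /mnt/tmp
  ls -di /mnt/tmp/lost+found      # note the inode number
  umount /mnt/tmp
  xfs_db -r -f -c 'inode <ino>' -c 'print' /path/to/fresh.img

...and then grep the output of the original repair run for that inode
number to see what it did with it.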

Brian

> df -i
> Filesystem                    Inodes   IUsed      IFree IUse% Mounted on
> rootfs                             0       0          0     - /
> /dev/root                          0       0          0     - /
> tmpfs                        2058692     990    2057702    1% /run
> tmpfs                        2058692       6    2058686    1% /run/lock
> tmpfs                        2058692    1623    2057069    1% /dev
> tmpfs                        2058692       3    2058689    1% /run/shm
> guitare:/mnt/raid/partage   33554432  305069   33249363    1% /mnt/qnap1
> /dev/loop0                4914413568 5199932 4909213636    1% /mnt/rescue
> 
> df
> /dev/loop0                122858252288 88827890868 34030361420  73% /mnt/rescue
> 
> I'll give a newer version of xfs_repair a shot, just in case...
> 
> -- 
> ------------------------------------------------------------------------
> Emmanuel Florac     |   Direction technique
>                     |   Intellique
>                     |	<eflorac@xxxxxxxxxxxxxx>
>                     |   +33 1 78 94 84 02
> ------------------------------------------------------------------------

