Re: xfs_repair doesn't handle: br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0 corruption

On 15.07.2020 at 13:40, Brian Foster wrote:
> On Wed, Jul 15, 2020 at 09:05:47AM +0200, Arkadiusz Miśkiewicz wrote:
>>
>> Hello.
>>
>> xfs_repair (from for-next from about 2-3 weeks ago) doesn't seem to
>> handle this kind of corruption. Repair (run a few times) finishes just
>> fine, but it ends up with such a trace again.
>>
> 
> Are you saying that xfs_repair eventually resolves the corruption but it
> takes multiple tries, and then the corruption reoccurs at runtime? Or
> that xfs_repair doesn't ever resolve the corruption?
> 
> Either way, what does xfs_repair report?

http://ixion.pld-linux.org/~arekm/xfs/xfs-repair.txt

This is the repair that I did back in 2020 on a metadumped image (linked below).


But I also ran repair recently with xfsprogs 5.10.0

http://ixion.pld-linux.org/~arekm/xfs/xfs-repair-sdd1-20210228.txt

on the actual fs, and today it crashed:

[ 3580.278435] XFS (sdd1): xfs_dabuf_map: bno 8388608 dir: inode 36509341678
[ 3580.278436] XFS (sdd1): [00] br_startoff 8388608 br_startblock -2
br_blockcount 1 br_state 0
[ 3580.278452] XFS (sdd1): Internal error xfs_da_do_buf(1) at line 2557
of file fs/xfs/libxfs/xfs_da_btree.c.  Caller xfs_da_read_buf+0x7c/0x130
[xfs]

So the 5.10.0 repair doesn't fix it either.
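
For reference, the -2 in br_startblock is the in-core "hole" marker, so the
extent record covering that directory offset maps to no real block at all.
A minimal sketch of the relevant special values, assuming they still match
what fs/xfs/libxfs/xfs_bmap.h defines (recalled from memory, so verify
against your tree):

/*
 * Special values for the br_startblock field of an in-core extent
 * record (my reading of fs/xfs/libxfs/xfs_bmap.h).
 */
typedef unsigned long long xfs_fsblock_t;

#define DELAYSTARTBLOCK ((xfs_fsblock_t)-1LL) /* delalloc, no block assigned yet */
#define HOLESTARTBLOCK  ((xfs_fsblock_t)-2LL) /* hole, nothing mapped here */

/*
 * Printed as a signed value, HOLESTARTBLOCK comes out as -2, which is
 * exactly the "br_startblock -2" in the trace above: directory block
 * 8388608 is mapped by a hole instead of a real leaf/node block, and
 * the dabuf mapping code rejects that because a real block is
 * expected at that offset.
 */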

> 
>> Metadump is possible but problematic (will be huge).
>>
> 
> How huge? Will it compress?

53 GB

http://ixion.pld-linux.org/~arekm/xfs/sdd1.metadump.gz


> 
>>
>> Jul  9 14:35:51 x kernel: XFS (sdd1): xfs_dabuf_map: bno 8388608 dir:
>> inode 21698340263
>> Jul  9 14:35:51 x kernel: XFS (sdd1): [00] br_startoff 8388608
>> br_startblock -2 br_blockcount 1 br_state 0
> 
> It looks like we found a hole at the leaf offset of a directory. We'd
> expect to find a leaf or node block there depending on the directory
> format (which appears to be node format based on the stack below) that
> contains hashval lookup information for the dir.
> 
> It's not clear how we'd get into this state. Had this system experienced
> any crash/recovery sequences or storage issues before the first
> occurrence?

Yes, more than once; that's my "famous" server which has seen a lot of fs damage.

Anyway, it would be nice if repair could fix such a messed-up startblock,
because the kernel crashes on it so easily (or at least I assume that's the cause).
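
To make the "leaf offset" part of Brian's explanation concrete: a directory's
file address space is carved into 32 GiB segments (data, then leaf/node, then
freespace), so with 4 KiB filesystem blocks the leaf/node segment starts
exactly at file block 8388608, the bno in every one of these traces. A small
sketch under that assumption (constant values recalled from
fs/xfs/libxfs/xfs_da_format.h, so verify against your tree):

#include <stdio.h>

/*
 * Directory virtual address space layout: 32 GiB segments for data,
 * leaf/node and freespace blocks.
 */
#define XFS_DIR2_SPACE_SIZE	(1ULL << 35)			/* 32 GiB per segment */
#define XFS_DIR2_LEAF_OFFSET	(1 * XFS_DIR2_SPACE_SIZE)	/* leaf/node blocks */
#define XFS_DIR2_FREE_OFFSET	(2 * XFS_DIR2_SPACE_SIZE)	/* freespace blocks */

int main(void)
{
	unsigned int blocklog = 12;	/* 4 KiB filesystem blocks */

	/* 32 GiB / 4 KiB = 8388608, matching "bno 8388608" in the traces */
	printf("leaf segment starts at dir block %llu\n",
	       (unsigned long long)(XFS_DIR2_LEAF_OFFSET >> blocklog));
	return 0;
}

So, as far as I can tell, it's specifically the first leaf/node block of that
directory's hash index whose mapping got turned into a hole, which is why
every lookup in the directory trips over the same bno.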

> 
> Brian
> 
>> Jul  9 14:35:51 x kernel: XFS (sdd1): Internal error xfs_da_do_buf(1) at
>> line 2557 of file fs/xfs/libxfs/xfs_da_btree.c.  Caller
>> xfs_da_read_buf+0x6a/0x120 [xfs]
>> Jul  9 14:35:51 x kernel: CPU: 3 PID: 2928 Comm: cp Tainted: G
>>   E     5.0.0-1-03515-g3478588b5136 #10
>> Jul  9 14:35:51 x kernel: Hardware name: Supermicro X10DRi/X10DRi, BIOS
>> 3.0a 02/06/2018
>> Jul  9 14:35:51 x kernel: Call Trace:
>> Jul  9 14:35:51 x kernel:  dump_stack+0x5c/0x80
>> Jul  9 14:35:51 x kernel:  xfs_dabuf_map.constprop.0+0x1dc/0x390 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_da_read_buf+0x6a/0x120 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_da3_node_read+0x17/0xd0 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_da3_node_lookup_int+0x6c/0x370 [xfs]
>> Jul  9 14:35:51 x kernel:  ? kmem_cache_alloc+0x14e/0x1b0
>> Jul  9 14:35:51 x kernel:  xfs_dir2_node_lookup+0x4b/0x170 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_dir_lookup+0x1b5/0x1c0 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_lookup+0x57/0x120 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_vn_lookup+0x70/0xa0 [xfs]
>> Jul  9 14:35:51 x kernel:  __lookup_hash+0x6c/0xa0
>> Jul  9 14:35:51 x kernel:  ? _cond_resched+0x15/0x30
>> Jul  9 14:35:51 x kernel:  filename_create+0x91/0x160
>> Jul  9 14:35:51 x kernel:  do_linkat+0xa5/0x360
>> Jul  9 14:35:51 x kernel:  __x64_sys_linkat+0x21/0x30
>> Jul  9 14:35:51 x kernel:  do_syscall_64+0x55/0x100
>> Jul  9 14:35:51 x kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>>
>> Longer log:
>> http://ixion.pld-linux.org/~arekm/xfs-10.txt
>>
>>
>> -- 
>> Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
>>
> 

(Resending because vger still blocks my primary maven domain and most
likely nothing has changed with the postmasters' attitude; I didn't try... :/ )

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


