Hi Brian,

On Monday, 21.01.2019 at 13:11 -0500, Brian Foster wrote:
[...]
> > So for the moment, here's the output of the above sequence.
> >
> > xfs_db> convert agno 5 agbno 7831662 fsb
> > 0x5077806e (1350008942)
> > xfs_db> fsb 0x5077806e
> > xfs_db> type finobt
> > xfs_db> print
> > magic = 0x49414233
> > level = 1
> > numrecs = 335
> > leftsib = 7810856
> > rightsib = null
> > bno = 7387612016
> > lsn = 0x6671003d9700
> > uuid = 026711cc-25c7-44b9-89aa-0aac496edfec
> > owner = 5
> > crc = 0xe12b19b2 (correct)
>
> As expected, we have the inobt magic. Interesting that this is a fairly
> full intermediate (level > 0) node. There is no right sibling, which
> means we're at the far right end of the tree. I wouldn't mind poking
> around a bit more at the tree, but that might be easier with access to
> the metadump. I also think that xfs_repair would have complained were
> something more significant wrong with the tree.
>
> Hmm, I wonder if the (lightly tested) diff below would help us catch
> anything. It basically just splits up the currently combined inobt and
> finobt I/O verifiers to expect the appropriate magic number (rather than
> accepting either magic for both trees). Could you give that a try?
> Unless we're doing something like using the wrong type of cursor for a
> particular tree, I'd think this would catch wherever we happen to put a
> bad magic on disk. Note that this assumes the underlying filesystem has
> been repaired so as to try and detect the next time an on-disk
> corruption is introduced.
>
> You'll also need to turn up the XFS error level to make sure this prints
> out a stack trace if/when a verifier failure triggers:
>
> echo 5 > /proc/sys/fs/xfs/error_level
>
> I guess we also shouldn't rule out hardware issues or whatnot. I did
> notice you have a strange kernel version: 4.19.4-holodeck10. Is that a
> distro kernel? Has it been modified from upstream in any way? If so, I'd
> strongly suggest to try and confirm whether this is reproducible with an
> upstream kernel.

With the finobt verifier changes applied, we are unable to mount the FS,
even after running xfs_repair.

xfs_repair had found "bad magic # 0x49414233 in inobt block 5/2631703",
which would be daddr 0x1b5db40b8 according to xfs_db. The mount trips
over a buffer at a different daddr, though:

[ 73.237007] XFS (dm-3): Mounting V5 Filesystem
[ 73.456481] XFS (dm-3): Ending clean mount
[ 74.132671] XFS (dm-3): Metadata corruption detected at xfs_finobt_verify+0x50/0x90 [xfs], xfs_finobt block 0x1b5df7d50
[ 74.133028] XFS (dm-3): Unmount and run xfs_repair
[ 74.133184] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
[ 74.133395] 00000000e44dfb87: 49 41 42 33 00 01 01 50 00 07 53 58 ff ff ff ff  IAB3...P..SX....
[ 74.133679] 000000009f21b317: 00 00 00 01 b5 df 7d 50 00 00 00 00 00 00 00 00  ......}P........
[ 74.133964] 000000003429321b: 02 67 11 cc 25 c7 44 b9 89 aa 0a ac 49 6e df ec  .g..%.D.....In..
[ 74.134272] 00000000fe79b835: 00 00 00 05 24 52 54 c6 32 dc 7d 00 32 e9 b9 a0  ....$RT.2.}.2...
[ 74.134554] 00000000d1e887dc: 32 f4 97 80 33 01 36 80 33 09 ca 80 33 1e b7 80  2...3.6.3...3...
[ 74.134852] 00000000612879d2: 33 2f 50 00 33 33 e8 80 33 40 a9 c0 33 4c 08 80  3/P.33..3@..3L..
[ 74.135140] 00000000e63fd33a: 33 64 d7 80 33 79 34 40 33 8f 08 80 33 a7 be c0  3d..3y4@3...3...
[ 74.135427] 00000000d1c405d7: 33 b6 10 80 33 bf 1e c0 33 d0 99 00 33 df cd 00  3...3...3...3...
[ 74.135871] XFS (dm-3): Metadata corruption detected at xfs_finobt_verify+0x50/0x90 [xfs], xfs_finobt block 0x1b5df7d50
[ 74.136231] XFS (dm-3): Unmount and run xfs_repair
[ 74.136390] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
[ 74.136604] 00000000e44dfb87: 49 41 42 33 00 01 01 50 00 07 53 58 ff ff ff ff  IAB3...P..SX....
[ 74.136887] 000000009f21b317: 00 00 00 01 b5 df 7d 50 00 00 00 00 00 00 00 00  ......}P........
[ 74.137174] 000000003429321b: 02 67 11 cc 25 c7 44 b9 89 aa 0a ac 49 6e df ec  .g..%.D.....In..
[ 74.137463] 00000000fe79b835: 00 00 00 05 24 52 54 c6 32 dc 7d 00 32 e9 b9 a0  ....$RT.2.}.2...
[ 74.137750] 00000000d1e887dc: 32 f4 97 80 33 01 36 80 33 09 ca 80 33 1e b7 80  2...3.6.3...3...
[ 74.138035] 00000000612879d2: 33 2f 50 00 33 33 e8 80 33 40 a9 c0 33 4c 08 80  3/P.33..3@..3L..
[ 74.138358] 00000000e63fd33a: 33 64 d7 80 33 79 34 40 33 8f 08 80 33 a7 be c0  3d..3y4@3...3...
[ 74.138639] 00000000d1c405d7: 33 b6 10 80 33 bf 1e c0 33 d0 99 00 33 df cd 00  3...3...3...3...
[ 74.138964] XFS (dm-3): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x1b5df7d50 len 8 error 117
[ 78.489686] XFS (dm-3): Error -117 reserving per-AG metadata reserve pool.
[ 78.489691] XFS (dm-3): xfs_do_force_shutdown(0x8) called from line 548 of file fs/xfs/xfs_fsops.c. Return address = 00000000b2beb4b0
[ 78.489697] XFS (dm-3): Corruption of in-memory data detected. Shutting down filesystem
[ 78.489955] XFS (dm-3): Please umount the filesystem and rectify the problem(s)

Is this a real issue, or a false positive due to things working
differently during early mount?

Regards,
Lucas
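
For reference, the header of the dumped buffer can be decoded by hand. The following is only a minimal sketch in Python, assuming the short-form V5 btree block header layout (magic, level, numrecs, leftsib, rightsib, blkno, lsn, uuid, owner, crc; all big-endian except the crc). The helper names decode_sblock_header and daddr_to_ag are made up for illustration. The geometry constants (8 sectors per block, 183123968 blocks per AG) are not read from the superblock; they are inferred from the agno/agbno and daddr pairs quoted in this thread, so treat them as assumptions.

```python
import struct
import uuid

# First 56 bytes of the buffer at daddr 0x1b5df7d50, copied from the dump.
DUMP = bytes.fromhex(
    "49414233 00010150 00075358 ffffffff "
    "00000001 b5df7d50 00000000 00000000 "
    "026711cc 25c744b9 89aa0aac 496edfec "
    "00000005 245254c6"
)

# CRC-enabled (V5) magic numbers for the two inode btrees.
MAGICS = {0x49414233: "inobt (IAB3)", 0x46494233: "finobt (FIB3)"}

# Assumed geometry, inferred from the numbers in this thread only.
SECTORS_PER_BLOCK = 8        # "len 8" reads, i.e. 4 KiB blocks
AG_BLOCKS = 183123968        # derived from 5/2631703 <-> daddr 0x1b5db40b8


def decode_sblock_header(buf):
    """Decode a short-form V5 btree block header (hypothetical helper)."""
    (magic, level, numrecs, leftsib, rightsib,
     blkno, lsn, raw_uuid, owner) = struct.unpack_from(">IHHIIQQ16sI", buf, 0)
    (crc,) = struct.unpack_from("<I", buf, 52)  # crc is stored little-endian
    return {
        "magic": magic,
        "tree": MAGICS.get(magic, "unknown"),
        "level": level,
        "numrecs": numrecs,
        "leftsib": leftsib,
        "rightsib": None if rightsib == 0xFFFFFFFF else rightsib,
        "blkno": blkno,
        "lsn": lsn,
        "uuid": str(uuid.UUID(bytes=raw_uuid)),
        "owner": owner,
        "crc": crc,
    }


def daddr_to_ag(daddr):
    """Map a 512-byte sector address to (agno, agbno) under the assumed geometry."""
    return divmod(daddr // SECTORS_PER_BLOCK, AG_BLOCKS)


if __name__ == "__main__":
    hdr = decode_sblock_header(DUMP)
    for key, value in hdr.items():
        print(f"{key:8} = {value}")
    print("location =", daddr_to_ag(hdr["blkno"]))
```

Under those assumptions the buffer decodes as a level 1 block with the inobt magic (IAB3), 336 records, no right sibling, owner 5, and a blkno field equal to the daddr being read (0x1b5df7d50). The same arithmetic places it at AG 5, agbno 2666410, which is a different block from the 5/2631703 that xfs_repair reported.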