Re: Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58

Hi Dave,

[+Ted as the topic also applies to ext4]

On 3/17/22 04:08, Dave Chinner wrote:
On Thu, Mar 17, 2022 at 01:47:05PM +1100, Dave Chinner wrote:
On Wed, Mar 16, 2022 at 09:55:04AM +0100, Manfred Spraul wrote:
Hi Dave,

On 3/14/22 16:18, Manfred Spraul wrote:

But:

I've checked the eMMC specification, and the spec allows torn writes to
happen:
Yes, most storage only guarantees that sector writes are atomic and
so multi-sector writes have no guarantees of being written
atomically.  IOWs, all storage technologies that currently exist are
allowed to tear multi-sector writes.

However, FUA writes are guaranteed to be whole on persistent storage
regardless of size when the hardware signals completion. And any
write that the hardware has signalled as complete before a cache
flush is received is also guaranteed to be whole on persistent
storage when the cache flush is signalled as complete by the
hardware. These mechanisms provide protection against torn writes.

My plan was to create a replay application that randomly creates disc images allowed by the writeback_cache_control documentation.

https://www.kernel.org/doc/html/latest/block/writeback_cache_control.html

And then check that the filesystem behaves as expected/defined.

The first step was: Implement the framework and just stop at a random location.
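
A minimal sketch of what such a replay step could look like, under simplifying assumptions that are mine and not taken from the thread: sectors are 512 bytes and written atomically, every operation before the crash point has been signalled as complete, and the names `Write`, `Flush` and `crash_image` are hypothetical. Writes that complete before a completed flush, and completed FUA writes, are treated as whole on stable storage; any other write may persist sector by sector, which is what produces a torn write.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Union

SECTOR = 512  # assumed sector size; each sector is written atomically


@dataclass
class Write:
    sector: int        # first sector of the write
    data: bytes        # payload, a multiple of SECTOR bytes
    fua: bool = False  # True if issued as a FUA write


@dataclass
class Flush:
    pass               # a cache flush that completed


def crash_image(log: List[Union[Write, Flush]], crash_at: int,
                rng: random.Random) -> Dict[int, bytes]:
    """Return one possible on-disk state {sector: bytes} after a crash.

    Only log[:crash_at] is considered, and all of it counts as completed.
    """
    done = log[:crash_at]
    # Index of the last completed flush, or -1 if there was none.
    last_flush = max(
        (i for i, op in enumerate(done) if isinstance(op, Flush)), default=-1
    )
    image: Dict[int, bytes] = {}
    for i, op in enumerate(done):
        if isinstance(op, Flush):
            continue
        # Durable: FUA write, or completed before the last flush.
        durable = op.fua or i < last_flush
        for n in range(len(op.data) // SECTOR):
            # A non-durable write may land partially: this is the tear.
            if durable or rng.random() < 0.5:
                image[op.sector + n] = op.data[n * SECTOR:(n + 1) * SECTOR]
    return image
```

Calling `crash_image(log, crash_at, random.Random(seed))` with different seeds and crash points yields different candidate images from the same recorded write log; each image can then be mounted and checked.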

Is my understanding correct that XFS supports neither eMMC nor NVM devices?
(unless there is a battery backup that exceeds the guarantees from the spec)
Incorrect.

They are supported just fine because flush/FUA semantics provide
guarantees against torn writes in normal operation. IOWs, torn
writes are something that almost *never* happen in real life, even
when power fails suddenly. Despite this, XFS can detect it has
occurred (because broken storage is all too common!), and if it
can't recover automatically, it will shut down and ask the user to
correct the problem.

So for xfs the behavior should be:

- without torn writes: Mount always successful, no errors when accessing the content.

- with torn writes: There may be errors that are detected only at runtime. These errors may eventually cause a file system shutdown.

(commented dmesg is attached)

The applications I have in mind are embedded systems.

I.e. there is no user who can correct anything, so the recovery strategy must be part of the design.

BTRFS and ZFS can also detect torn writes, and if you use the
(non-default) ext4 option "metadata_csum" it will also detect torn
Correction - metadata_csum is enabled by default, I just ran the
wrong mkfs command when I tested it a few moments ago.

For ext4, so far I have only seen corrupted commit blocks that cause mount failures.

https://lore.kernel.org/all/8fe067d0-6d57-9dd7-2c10-5a2c34037ee1@xxxxxxxxxxxxxxxx/

But Ted hasn't confirmed yet that this is by design :-)


--

    Manfred

1) setup
[ 1591.878832] loop0: detected capacity change from 0 to 1024000

For info: md5sum of the image file:
	b7103b519ada7dc5281d7a42c29a4271  /tmp/mount_img-2536.img

2) mount. Command:
	mount -t auto /dev/loop0 x
[ 1591.911516] XFS (loop0): Mounting V5 Filesystem
[ 1591.945058] XFS (loop0): Starting recovery (logdev: internal)
[ 1591.949055] XFS (loop0): resetting quota flags
[ 1591.949590] XFS (loop0): Ending recovery (logdev: internal)

Note in particular: the corruption is not noticed at mount time.

3) find x -type f
[ 1741.033535] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1741.033693] XFS (loop0): Unmount and run xfs_repair
[ 1741.033696] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1741.033700] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1741.033704] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1741.033706] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1741.033708] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1741.033711] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1741.033713] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1741.033715] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1741.033717] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1741.033761] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1741.033886] XFS (loop0): Unmount and run xfs_repair
[ 1741.033889] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1741.033893] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1741.033896] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1741.033899] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1741.033901] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1741.033903] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1741.033905] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1741.033907] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1741.033909] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1741.033920] XFS (loop0): metadata I/O error in "xfs_da_read_buf+0xb1/0x110 [xfs]" at daddr 0x86f58 len 8 error 74

--> corruption noticed at run time. FS tries to continue.
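
For reference, the first 48 bytes of the dumped buffer can be decoded by hand. The sketch below assumes the header follows the layout of struct xfs_dir3_blk_hdr in the kernel's XFS on-disk format headers (magic, crc, blkno, lsn, uuid, owner, all big-endian apart from the CRC, which is left as raw bytes here); the helper name `decode_dir3_blk_hdr` is hypothetical.

```python
import struct
import uuid

# First three hexdump rows of the corrupted buffer, copied from dmesg.
HEXDUMP = """
58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58
00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1
8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25
"""


def decode_dir3_blk_hdr(raw: bytes) -> dict:
    """Decode the assumed 48-byte directory block header."""
    magic, crc, blkno, lsn, fs_uuid, owner = struct.unpack(
        ">4s4sQQ16sQ", raw[:48]
    )
    return {
        "magic": magic,               # block type signature
        "crc_bytes": crc.hex(),       # stored checksum, not interpreted
        "blkno": blkno,               # disk address the block claims
        "lsn_cycle": lsn >> 32,       # log cycle of the last update
        "lsn_block": lsn & 0xFFFFFFFF,
        "uuid": str(uuid.UUID(bytes=fs_uuid)),
        "owner": owner,               # inode number owning the block
    }


raw = bytes.fromhex("".join(HEXDUMP.split()))
hdr = decode_dir3_blk_hdr(raw)
for key, value in hdr.items():
    print(key, hex(value) if isinstance(value, int) else value)
```

Under that assumed layout, the magic is XDB3 and the block number in the header (0x86f58) equals the daddr reported in the error, so the start of the block looks plausible on its own. That would be consistent with a tear somewhere later in the 8-sector buffer, although the dump alone does not prove it.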

4) manual playing around in the filesystem:
	rm -Rf
	mv a ../b

[ 1824.642762] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1824.643000] XFS (loop0): Unmount and run xfs_repair
[ 1824.643006] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1824.643014] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1824.643020] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1824.643025] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1824.643030] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1824.643035] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1824.643040] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1824.643044] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1824.643049] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1824.643145] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1824.643361] XFS (loop0): Unmount and run xfs_repair
[ 1824.643366] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1824.643371] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1824.643377] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1824.643381] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1824.643386] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1824.643390] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1824.643395] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1824.643399] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1824.643403] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1824.643433] XFS (loop0): metadata I/O error in "xfs_da_read_buf+0xb1/0x110 [xfs]" at daddr 0x86f58 len 8 error 74
[ 1824.643749] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1824.643880] XFS (loop0): Unmount and run xfs_repair
[ 1824.643883] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1824.643887] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1824.643891] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1824.643893] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1824.643896] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1824.643898] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1824.643900] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1824.643903] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1824.643905] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1824.643948] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1824.644074] XFS (loop0): Unmount and run xfs_repair
[ 1824.644076] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1824.644079] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1824.644081] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1824.644084] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1824.644086] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1824.644088] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1824.644091] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1824.644093] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1824.644095] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1824.644107] XFS (loop0): metadata I/O error in "xfs_da_read_buf+0xb1/0x110 [xfs]" at daddr 0x86f58 len 8 error 74
[ 1838.578296] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1838.578452] XFS (loop0): Unmount and run xfs_repair
[ 1838.578456] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1838.578460] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1838.578464] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1838.578467] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1838.578470] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1838.578472] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1838.578475] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1838.578477] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1838.578479] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1838.578529] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1838.578699] XFS (loop0): Unmount and run xfs_repair
[ 1838.578703] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1838.578707] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1838.578711] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1838.578715] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1838.578719] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1838.578724] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1838.578728] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1838.578732] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1838.578736] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1838.578785] XFS (loop0): metadata I/O error in "xfs_da_read_buf+0xb1/0x110 [xfs]" at daddr 0x86f58 len 8 error 74
[ 1876.302019] XFS (loop0): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58 
[ 1876.302177] XFS (loop0): Unmount and run xfs_repair
[ 1876.302180] XFS (loop0): First 128 bytes of corrupted metadata buffer:
[ 1876.302184] 00000000: 58 44 42 33 9f ab d7 f4 00 00 00 00 00 08 6f 58  XDB3..........oX
[ 1876.302188] 00000010: 00 00 00 0f 00 00 02 5c 53 5a 35 23 57 2c 4c f1  .......\SZ5#W,L.
[ 1876.302191] 00000020: 8d ac 7c b0 38 a9 ec b7 00 00 00 00 00 08 69 25  ..|.8.........i%
[ 1876.302194] 00000030: 02 28 0d 40 01 68 00 30 01 c8 00 30 00 00 00 00  .(.@.h.0...0....
[ 1876.302196] 00000040: 00 00 00 00 00 08 69 25 01 2e 02 00 00 00 00 40  ......i%.......@
[ 1876.302199] 00000050: 00 00 00 00 00 0c 0a 4d 02 2e 2e 02 00 00 00 50  .......M.......P
[ 1876.302201] 00000060: 00 00 00 00 00 08 69 26 0b 43 44 52 52 4f 4c 53  ......i&.CDRROLS
[ 1876.302204] 00000070: 2e 43 46 47 01 00 00 60 00 00 00 00 00 08 69 27  .CFG...`......i'
[ 1876.302221] XFS (loop0): metadata I/O error in "xfs_da_read_buf+0xb1/0x110 [xfs]" at daddr 0x86f58 len 8 error 74
[ 1876.302656] XFS (loop0): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x12f/0x2b0 [xfs] (fs/xfs/xfs_trans_buf.c:296).  Shutting down filesystem.
[ 1876.302912] XFS (loop0): Please unmount the filesystem and rectify the problem(s)


   -> Now I broke it :-(

5) umount
[ 1931.725585] XFS (loop0): Unmounting Filesystem

6) run xfs_repair.
	This fails because the log first needs to be replayed.

7) mount -t auto /dev/loop0 x
[ 2041.247530] XFS (loop0): Mounting V5 Filesystem
[ 2041.251838] XFS (loop0): Starting recovery (logdev: internal)
[ 2041.252877] XFS (loop0): Ending recovery (logdev: internal)

8) umount
[ 2047.218062] XFS (loop0): Unmounting Filesystem

9) run xfs_repair
	This is successful.
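
On an embedded system without an operator, the manual sequence at the end of this log (mount to replay the log, umount, then xfs_repair) would have to be automated. A rough sketch, assuming the device and mount point are known and the process has the required privileges; `recovery_commands` and `run_recovery` are hypothetical helper names, and error handling is limited to stopping at the first failing command.

```python
import subprocess
from typing import Callable, List


def recovery_commands(device: str, mountpoint: str) -> List[List[str]]:
    """Return the commands of the manual recovery sequence, in order."""
    return [
        ["mount", "-t", "xfs", device, mountpoint],  # mounting replays the log
        ["umount", mountpoint],                      # leave a clean log behind
        ["xfs_repair", device],                      # repair the metadata
    ]


def run_recovery(device: str, mountpoint: str,
                 runner: Callable = subprocess.run) -> None:
    """Run each step; with subprocess.run, check=True raises on failure."""
    for cmd in recovery_commands(device, mountpoint):
        runner(cmd, check=True)
```

Passing a different `runner` allows a dry run that only records the commands, which is useful when testing the recovery logic against generated images.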
