From: Bill O'Donnell <bodonnel@xxxxxxxxxx>

For xfs_repair, there is a case when -EFSBADCRC is encountered but not
acted on. Modify da_read_buf to check for and repair.

The current implementation fails for the case:

$ xfs_repair xfs_metadump_hosting.dmp.image
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Metadata CRC error detected at 0x46cde8, xfs_dir3_block block 0xd3c50/0x1000
bad directory block magic # 0x16011664 in block 0 for directory inode 867467
corrupt directory block 0 for inode 867467
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
bad directory block magic # 0x16011664 in block 0 for directory inode 867467
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad directory block magic # 0x16011664 for directory inode 867467 block 0: fixing magic # to 0x58444233
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Metadata corruption detected at 0x46cc88, xfs_dir3_block block 0xd3c50/0x1000
libxfs_bwrite: write verifier failed on xfs_dir3_block bno 0xd3c50/0x8
xfs_repair: Releasing dirty buffer to free list!
xfs_repair: Refusing to write a corrupt buffer to the data device!
xfs_repair: Lost a write to the data device!

fatal error -- File system metadata writeout failed, err=117. Re-run xfs_repair.

With the patch applied:

$ xfs_repair xfs_metadump_hosting.dmp.image
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Metadata CRC error detected at 0x46ce28, xfs_dir3_block block 0xd3c50/0x1000
bad directory block magic # 0x16011664 in block 0 for directory inode 867467
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad directory block magic # 0x16011664 in block 0 for directory inode 867467
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
Metadata CRC error detected at 0x46ce28, xfs_dir3_block block 0xd3c50/0x1000
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
bad directory block magic # 0x16011664 for directory inode 867467 block 0: fixing magic # to 0x58444233
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
rebuilding directory inode 867467
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

Signed-off-by: Bill O'Donnell <bodonnel@xxxxxxxxxx>
---
 repair/da_util.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/repair/da_util.c b/repair/da_util.c
index 7f94f4012062..0a4785e6f69b 100644
--- a/repair/da_util.c
+++ b/repair/da_util.c
@@ -66,6 +66,9 @@ da_read_buf(
 	}
 	libxfs_buf_read_map(mp->m_dev, map, nex, LIBXFS_READBUF_SALVAGE,
 			&bp, ops);
+	if (bp->b_error == -EFSBADCRC) {
+		libxfs_buf_relse(bp);
+	}
 	if (map != map_array)
 		free(map);
 	return bp;
-- 
2.48.1
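
A note on buffer ownership in the hunk above, offered as a minimal sketch and not as a proposed change: da_read_buf still returns bp after releasing it. The standalone C below does not use libxfs. struct mock_buf, mock_read, mock_buf_relse, bad_puts and the two read_* helpers are hypothetical stand-ins, and the EFSBADCRC definition assumes the usual XFS mapping onto EBADMSG. The sketch only shows how a hold count behaves when a callee drops its hold and the caller later drops it again. Whether that is what produces the cache_node_put messages in the patched run is an assumption, not something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: XFS maps its CRC error code onto EBADMSG. */
#ifndef EFSBADCRC
#define EFSBADCRC EBADMSG
#endif

/* Hypothetical stand-in for a cached buffer; not the real struct xfs_buf. */
struct mock_buf {
	int b_error;	/* negative errno recorded by the read, 0 on success */
	int refcount;	/* outstanding holds on the buffer */
};

/* Counts releases attempted on a buffer that has no holds left. */
int bad_puts;

/* Hypothetical stand-in for a release call: drop one hold. */
void mock_buf_relse(struct mock_buf *bp)
{
	if (bp->refcount == 0) {
		bad_puts++;
		fprintf(stderr, "put on refcount 0 (buf=%p)\n", (void *)bp);
		return;
	}
	bp->refcount--;
}

/* Hypothetical stand-in for a read: take one hold, record the result. */
struct mock_buf *mock_read(struct mock_buf *bp, int error)
{
	assert(bp != NULL);
	bp->b_error = error;
	bp->refcount++;
	return bp;
}

/* Same shape as the hunk: release on CRC error, then return the pointer. */
struct mock_buf *read_release_and_return(struct mock_buf *backing, int error)
{
	struct mock_buf *bp = mock_read(backing, error);

	if (bp->b_error == -EFSBADCRC)
		mock_buf_relse(bp);
	return bp;	/* caller receives a buffer it no longer holds */
}

/* Variant for comparison only: release and hand back NULL instead. */
struct mock_buf *read_release_and_null(struct mock_buf *backing, int error)
{
	struct mock_buf *bp = mock_read(backing, error);

	if (bp->b_error == -EFSBADCRC) {
		mock_buf_relse(bp);
		return NULL;	/* nothing left for the caller to release */
	}
	return bp;
}
```

In this model, a caller that releases the pointer returned by read_release_and_return after a CRC error increments bad_puts, while read_release_and_null leaves the caller nothing to release. Which behaviour the callers of da_read_buf actually expect, in particular given LIBXFS_READBUF_SALVAGE, is not covered by this sketch.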