Dear XFS people,

I have bumped into a corruption problem with one of my XFS filesystems. The filesystem lives on a RAID6 volume on a 3ware 9650SE-24M8 with battery backup and write cache enabled. The RAID6 configuration is 11 2.0TB WD15EARS disks, and the volume is reported as OK by the RAID card. I believe the corruption below happened when the RAID card reset itself, due to disk timeouts on another RAID6 volume on the same controller card (different story). I have tried to gather some relevant information, in the hope that someone can point me in the right direction for repairing this corruption.

Kernel:   2.6.26-2-amd64
OS:       Debian Linux lenny 64-bit
xfsprogs: 2.9.8

Output from xfs_info:

meta-data=/dev/sda1              isize=256    agcount=13, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=3295874295, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

Mounting the filesystem gives the following in dmesg:

[858397.713452] Starting XFS recovery on filesystem: sda1 (logdev: internal)
[858403.841603] Filesystem "sda1": XFS internal error xfs_btree_check_sblock at line 334 of file fs/xfs/xfs_btree.c.  Caller 0xffffffffa0138321
[858403.841603] Pid: 31433, comm: mount Not tainted 2.6.26-2-amd64 #1
[858403.841603]
[858403.841603] Call Trace:
[858403.841603]  [<ffffffffa0138321>] :xfs:xfs_alloc_lookup+0x133/0x34f
[858403.841603]  [<ffffffffa014c7fb>] :xfs:xfs_btree_check_sblock+0xaf/0xbf
[858403.841603]  [<ffffffffa0138321>] :xfs:xfs_alloc_lookup+0x133/0x34f
[858403.841603]  [<ffffffffa014c322>] :xfs:xfs_btree_init_cursor+0x31/0x1ae
[858403.841603]  [<ffffffffa0135d17>] :xfs:xfs_free_ag_extent+0x63/0x6b5
[858403.841603]  [<ffffffff8042a354>] __down_read+0x12/0xa1
[858403.841603]  [<ffffffffa01379dd>] :xfs:xfs_free_extent+0xa9/0xc9
[858403.841603]  [<ffffffffa01694b3>] :xfs:xlog_recover_process_efi+0x10e/0x167
[858403.841603]  [<ffffffffa016a6a4>] :xfs:xlog_recover_process_efis+0x4b/0x85
[858403.841603]  [<ffffffffa016a6f3>] :xfs:xlog_recover_finish+0x15/0xb5
[858403.841603]  [<ffffffffa016f2f7>] :xfs:xfs_mountfs+0x475/0x5ac
[858403.841603]  [<ffffffffa017a311>] :xfs:kmem_alloc+0x60/0xc4
[858403.841603]  [<ffffffffa0174eb4>] :xfs:xfs_mount+0x29b/0x347
[858403.841603]  [<ffffffffa01833e6>] :xfs:xfs_fs_fill_super+0x0/0x1ee
[858403.841603]  [<ffffffffa018349b>] :xfs:xfs_fs_fill_super+0xb5/0x1ee
[858403.841603]  [<ffffffff8029d334>] get_sb_bdev+0xf8/0x145
[858403.841603]  [<ffffffff8029cd58>] vfs_kern_mount+0x93/0x11b
[858403.841603]  [<ffffffff8029ce33>] do_kern_mount+0x43/0xe3
[858403.841603]  [<ffffffff802b18c9>] do_new_mount+0x5b/0x95
[858403.841603]  [<ffffffff802b1ac0>] do_mount+0x1bd/0x1e7
[858403.841603]  [<ffffffff802769a1>] __alloc_pages_internal+0xd6/0x3bf
[858403.841603]  [<ffffffff802b1b74>] sys_mount+0x8a/0xce
[858403.841603]  [<ffffffff8020beca>] system_call_after_swapgs+0x8a/0x8f
[858403.841603]
[858403.841603] Failed to recover EFIs on filesystem: sda1
[858403.841603] XFS: log mount finish failed

Output from xfs_check -v:

ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_check.  If you are unable to mount the filesystem, then use
the xfs_repair -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
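Heeding that warning, the destructive experiment mentioned further down was done on a copy of the filesystem, not on /dev/sda1 itself. For anyone who wants to look at the metadata, this is roughly how a metadump can be produced and restored into a sparse image for safe testing (file names here are illustrative):

    # dump metadata only from the unmounted device;
    # file names are obfuscated by default
    xfs_metadump /dev/sda1 sda1.metadump

    # on a test machine: restore into a sparse image file
    # and run a read-only (no-modify) repair pass against it
    xfs_mdrestore sda1.metadump sda1.img
    xfs_repair -n sda1.img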
Output from xfs_repair -n produced a 2GB file, so this is the cleaned-up version (repeated messages deduplicated first, with their counts):

- block (1,499522) already used, state 7 (3_636_499 of these)
- block (7,480993) multiply claimed by bno space tree, state - (26_547_241 of these)
- bno freespace btree block claimed (state 1), agno 7, bno 65565, suspect 0 (158 of these)
- bcnt freespace btree block claimed (state 1), agno 7, bno 567395, suspect 0 (175 of these)
- data fork in ino 84753919 claims free block 291349280 (4_580_113 of these)
- would have junked entry "foo" in directory inode 136 (10_095 of these)
- would have corrected i8 count in directory 136 from 2 to 1 (9_016 of these)
- entry "foo" at block 0 offset 72 in directory inode 16955069 references non-existent inode 30065663864
  would clear inode number in entry at offset 72... (43_379 of these)

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
bad magic # 0x26c4 in btcnt block 1/302903
expected level 0 got 514 in btcnt block 1/302903
bad magic # 0x26c4 in btbno block 7/604731
expected level 0 got 256 in btbno block 7/604731
bad magic # 0x26c4 in btbno block 9/8428277
expected level 0 got 59755 in btbno block 9/8428277
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
found inodes not in the inode allocation tree
found inodes not in the inode allocation tree
found inodes not in the inode allocation tree
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
bad directory block magic # 0x26c4 in block 0 for directory inode 6159645547
corrupt block 0 in directory inode 6159645547
        would junk block
no . entry for directory 6159645547
no .. entry for directory 6159645547
problem with directory contents in inode 6159645547
would have cleared inode 6159645547
        - agno = 2
        - agno = 3
        - agno = 4
bad directory block magic # 0x6173733d in block 0 for directory inode 19126674939
corrupt block 0 in directory inode 19126674939
        would junk block
no . entry for directory 19126674939
no .. entry for directory 19126674939
problem with directory contents in inode 19126674939
would have cleared inode 19126674939
        - agno = 5
        - agno = 6
        - agno = 7
42189950: Badness in key lookup (length)
bp=(bno 15170340024, len 16384 bytes) key=(bno 15170340024, len 8192 bytes)
        - agno = 8
bad directory block magic # 0x45b419cb in block 0 for directory inode 35775783660
corrupt block 0 in directory inode 35775783660
        would junk block
no . entry for directory 35775783660
no .. entry for directory 35775783660
problem with directory contents in inode 35775783660
would have cleared inode 35775783660
        - agno = 9
        - agno = 10
bad nblocks 20513 for inode 43585639210, would reset to 15192
bad nextents 37 for inode 43585639210, would reset to 32
        - agno = 11
        - agno = 12
bad directory block magic # 0x58443244 in block 0 for directory inode 51803060746
corrupt block 0 in directory inode 51803060746
        would junk block
no . entry for directory 51803060746
no .. entry for directory 51803060746
problem with directory contents in inode 51803060746
would have cleared inode 51803060746
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
bad directory block magic # 0x26c4 in block 0 for directory inode 6159645547
corrupt block 0 in directory inode 6159645547
        would junk block
no . entry for directory 6159645547
no .. entry for directory 6159645547
problem with directory contents in inode 6159645547
would have cleared inode 6159645547
        - agno = 2
        - agno = 3
        - agno = 4
bad directory block magic # 0x6173733d in block 0 for directory inode 19126674939
corrupt block 0 in directory inode 19126674939
        would junk block
no . entry for directory 19126674939
no .. entry for directory 19126674939
problem with directory contents in inode 19126674939
would have cleared inode 19126674939
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
bad directory block magic # 0x45b419cb in block 0 for directory inode 35775783660
corrupt block 0 in directory inode 35775783660
        would junk block
no . entry for directory 35775783660
no .. entry for directory 35775783660
problem with directory contents in inode 35775783660
would have cleared inode 35775783660
        - agno = 9
        - agno = 10
bad nblocks 20513 for inode 43585639210, would reset to 15192
bad nextents 37 for inode 43585639210, would reset to 32
        - agno = 11
        - agno = 12
bad directory block magic # 0x58443244 in block 0 for directory inode 51803060746
corrupt block 0 in directory inode 51803060746
        would junk block
no . entry for directory 51803060746
no .. entry for directory 51803060746
problem with directory contents in inode 51803060746
would have cleared inode 51803060746
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
No modify flag set, skipping filesystem flush and exiting.

I did run "xfs_repair -L" on an image of the filesystem on another server, and I ended up with about 50,000 entries in lost+found (~750,000 entries recursively). The output from "xfs_logprint -t" is attached, and an xfs_metadump can be made available.

Is there any way to diagnose and salvage this? Any and all help is much appreciated.

Best regards
Erik Gulliksson
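P.S. For completeness, the "xfs_repair -L" test mentioned above was done roughly like this, on a copy on another server (paths are illustrative, from memory):

    # copy the device to an image file, continuing past read errors
    dd if=/dev/sda1 of=/scratch/sda1.img bs=4M conv=noerror,sync

    # zero the log and repair the copy, never the original device
    xfs_repair -L /scratch/sda1.img

    # loop-mount the repaired copy read-only and inspect lost+found
    mount -t xfs -o ro,loop /scratch/sda1.img /mnt/test
    ls /mnt/test/lost+found | wc -l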
Attachment:
xfs_logprint-t.txt.gz
Description: GNU Zip compressed data