I'm trying to dedupe the two large XFS filesystems on which I have DVR
recordings, so that I can walk around amongst the available HDDs and
create new filesystems under everything.

Every time I rm a file, the filesystem blows up, and the driver shuts it
down.

Some background:

At the moment, I have 2 devices, /dev/sdd1 mounted on /appl/media4, and
/dev/sda1 mounted on /appl/media5, and a large script, created by
hand-hacking the output of a perl dupe finder script.

The large script was mangled so that it would remove anything that was a
dupe from media4, unless the file was an unlabeled lost+found on media5,
and had a name on media4. In that case, I removed the file on media5,
and then moved it from media4 to media5.

After the hand-hacking on the script, I sorted it to do all the rm's
first, and then all the mv's, to make sure free space went up before it
went down.

And, of course, when I ran the script, it caused the XFS driver to cough
and die, leading to error 5s and gnashing of teeth.

I unmounted media5, remounted it (which worked), and unmounted it again
to run xfs_repair -n. That found one inode that was pointing somewhere
bogus (and I apologize that I can't copy that in; I was running under
screen, and it doesn't cooperate with scrollback well).

I ran an xfs_repair without -n, and it found and fixed the one error
without complaint. I mounted and unmounted it successfully (nothing
notable in dmesg), and reran xfs_repair -n, which, this time, ran
without any problems reported.

So I remounted the filesystem, and again tried to run the script. And
again, it tripped something, and the filesystem unmounted, and here's
the dmesg output from the first and second trips:

First time:

[169324.654803] XFS (sdd1): Ending clean mount
[1278872.471310] ccbc0000: 41 42 54 42 00 00 00 04 df ff ff ff ff ff ff ff  ABTB............
[1278872.471324] XFS (sda1): Internal error xfs_btree_check_sblock at line 119 of file /home/abuild/rpmbuild/BUILD/kernel-default-3.4.47/linux-3.4/fs/xfs/xfs_btree.c.  Caller 0xe3caf3a5
[1278872.471328]
[1278872.471334] Pid: 16696, comm: rm Not tainted 3.4.47-2.38-default #1
[1278872.471338] Call Trace:
[1278872.471368]  [<c0205349>] try_stack_unwind+0x199/0x1b0
[1278872.471382]  [<c02041c7>] dump_trace+0x47/0xf0
[1278872.471391]  [<c02053ab>] show_trace_log_lvl+0x4b/0x60
[1278872.471398]  [<c02053d8>] show_trace+0x18/0x20
[1278872.471409]  [<c06825ba>] dump_stack+0x6d/0x72
[1278872.471534]  [<e3c826ed>] xfs_corruption_error+0x5d/0x90 [xfs]
[1278872.471650]  [<e3cae9f4>] xfs_btree_check_sblock+0x74/0x100 [xfs]
[1278872.471834]  [<e3caf3a5>] xfs_btree_read_buf_block.constprop.24+0x95/0xb0 [xfs]
[1278872.472007]  [<e3caf423>] xfs_btree_lookup_get_block+0x63/0xc0 [xfs]
[1278872.472207]  [<e3cb251a>] xfs_btree_lookup+0x9a/0x460 [xfs]
[1278872.472379]  [<e3c9576a>] xfs_alloc_fixup_trees+0x27a/0x370 [xfs]
[1278872.472510]  [<e3c97b63>] xfs_alloc_ag_vextent_size+0x523/0x670 [xfs]
[1278872.472647]  [<e3c9874f>] xfs_alloc_ag_vextent+0x9f/0x100 [xfs]
[1278872.472781]  [<e3c9899a>] xfs_alloc_fix_freelist+0x1ea/0x450 [xfs]
[1278872.472915]  [<e3c98cd5>] xfs_free_extent+0xd5/0x160 [xfs]
[1278872.473052]  [<e3ca9f4e>] xfs_bmap_finish+0x15e/0x1b0 [xfs]
[1278872.473214]  [<e3cc47e9>] xfs_itruncate_extents+0x159/0x2f0 [xfs]
[1278872.473422]  [<e3c92ff5>] xfs_inactive+0x335/0x4a0 [xfs]
[1278872.473516]  [<c0337e84>] evict+0x84/0x150
[1278872.473530]  [<c032ea22>] do_unlinkat+0x102/0x160
[1278872.473546]  [<c069331c>] sysenter_do_call+0x12/0x28
[1278872.473578]  [<b779b430>] 0xb779b42f
[1278872.473583] XFS (sda1): Corruption detected. Unmount and run xfs_repair
[1278872.473599] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 3732 of file /home/abuild/rpmbuild/BUILD/kernel-default-3.4.47/linux-3.4/fs/xfs/xfs_bmap.c.
Return address = 0xe3ca9f8c
[1278872.584543] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
[1278872.584555] XFS (sda1): Please umount the filesystem and rectify the problem(s)
[1278881.888038] XFS (sda1): xfs_log_force: error 5 returned.
[1278911.968046] XFS (sda1): xfs_log_force: error 5 returned.
[1278942.048037] XFS (sda1): xfs_log_force: error 5 returned.
[1278972.128049] XFS (sda1): xfs_log_force: error 5 returned.
[1279002.208042] XFS (sda1): xfs_log_force: error 5 returned.
[1279028.046331] XFS (sda1): xfs_log_force: error 5 returned.
[1279028.046349] XFS (sda1): xfs_do_force_shutdown(0x1) called from line 1031 of file /home/abuild/rpmbuild/BUILD/kernel-default-3.4.47/linux-3.4/fs/xfs/xfs_buf.c.  Return address = 0xe3c813c0
[1279028.060676] XFS (sda1): xfs_log_force: error 5 returned.
[1279028.067532] XFS (sda1): xfs_log_force: error 5 returned.

Here's me mounting and umounting, with the xfs_repair runs in the middle:

[1279032.147391] XFS (sda1): Mounting Filesystem
[1279032.305924] XFS (sda1): Starting recovery (logdev: internal)
[1279035.263630] XFS (sda1): Ending recovery (logdev: internal)
[1279238.566041] XFS (sda1): Mounting Filesystem
[1279238.713051] XFS (sda1): Ending clean mount
[1279286.829764] XFS (sda1): Mounting Filesystem
[1279286.982409] XFS (sda1): Ending clean mount
[1279368.607644] XFS (sda1): Mounting Filesystem
[1279368.755048] XFS (sda1): Ending clean mount

Second time:

[1279388.664986] c1516000: 41 42 54 43 00 00 00 04 df ff ff ff ff ff ff ff  ABTC............
[1279388.665000] XFS (sda1): Internal error xfs_btree_check_sblock at line 119 of file /home/abuild/rpmbuild/BUILD/kernel-default-3.4.47/linux-3.4/fs/xfs/xfs_btree.c.
Caller 0xe3caf3a5
[1279388.665004]
[1279388.665010] Pid: 18452, comm: rm Not tainted 3.4.47-2.38-default #1
[1279388.665015] Call Trace:
[1279388.665045]  [<c0205349>] try_stack_unwind+0x199/0x1b0
[1279388.665058]  [<c02041c7>] dump_trace+0x47/0xf0
[1279388.665067]  [<c02053ab>] show_trace_log_lvl+0x4b/0x60
[1279388.665075]  [<c02053d8>] show_trace+0x18/0x20
[1279388.665086]  [<c06825ba>] dump_stack+0x6d/0x72
[1279388.665211]  [<e3c826ed>] xfs_corruption_error+0x5d/0x90 [xfs]
[1279388.665327]  [<e3cae9f4>] xfs_btree_check_sblock+0x74/0x100 [xfs]
[1279388.665511]  [<e3caf3a5>] xfs_btree_read_buf_block.constprop.24+0x95/0xb0 [xfs]
[1279388.665684]  [<e3caf423>] xfs_btree_lookup_get_block+0x63/0xc0 [xfs]
[1279388.665856]  [<e3cb251a>] xfs_btree_lookup+0x9a/0x460 [xfs]
[1279388.666029]  [<e3c97691>] xfs_alloc_ag_vextent_size+0x51/0x670 [xfs]
[1279388.666163]  [<e3c9874f>] xfs_alloc_ag_vextent+0x9f/0x100 [xfs]
[1279388.666298]  [<e3c9899a>] xfs_alloc_fix_freelist+0x1ea/0x450 [xfs]
[1279388.666433]  [<e3c98cd5>] xfs_free_extent+0xd5/0x160 [xfs]
[1279388.666571]  [<e3ca9f4e>] xfs_bmap_finish+0x15e/0x1b0 [xfs]
[1279388.666734]  [<e3cc47e9>] xfs_itruncate_extents+0x159/0x2f0 [xfs]
[1279388.666944]  [<e3c92ff5>] xfs_inactive+0x335/0x4a0 [xfs]
[1279388.667039]  [<c0337e84>] evict+0x84/0x150
[1279388.667053]  [<c032ea22>] do_unlinkat+0x102/0x160
[1279388.667069]  [<c069331c>] sysenter_do_call+0x12/0x28
[1279388.667100]  [<b772f430>] 0xb772f42f
[1279388.667105] XFS (sda1): Corruption detected. Unmount and run xfs_repair
[1279388.667120] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 3732 of file /home/abuild/rpmbuild/BUILD/kernel-default-3.4.47/linux-3.4/fs/xfs/xfs_bmap.c.  Return address = 0xe3ca9f8c
[1279388.690497] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
[1279388.690506] XFS (sda1): Please umount the filesystem and rectify the problem(s)
[1279398.816060] XFS (sda1): xfs_log_force: error 5 returned.
[1279428.832065] XFS (sda1): xfs_log_force: error 5 returned.

[ ... ]

It's not entirely clear to me whether the problem is specific inodes
that are corrupt, or just something in the filesystem header.

Kernel: Linux duckling 3.4.47-2.38-default #1 SMP Fri May 31 20:17:40 UTC 2013 (3961086) i686 athlon i386 GNU/Linux
progs:  xfsprogs-3.1.6-9.1.2.i586

Worst case, if I can't get these to behave, I'll just beg, borrow or
steal a spare 3T and copy everything to it, and then redo the FSs on
these 2 drives, but it would be a bit easier if I could get them to
settle down a bit...

Anyone have any suggestions as to which mole I should whack next?

[ ... ]

Built xfsprogs 3.1.11 from GIT, and ran it, and on /appl/media4,
/dev/sda1:

============
duckling:/appl/downloads/xfsprogs # xfs_repair /dev/sda1
Phase 1 - find and verify superblock...
Not enough RAM available for repair to enable prefetching.
This will be _slow_.
You need at least 497MB RAM to run with prefetching enabled.
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 2/128, freecount 62 nfree 61
ir_freecount/free mismatch, inode chunk 3/128, freecount 36 nfree 35
xfs_allocbt_read_verify: XFS_CORRUPTION_ERROR
xfs_allocbt_read_verify: XFS_CORRUPTION_ERROR
xfs_allocbt_read_verify: XFS_CORRUPTION_ERROR
xfs_allocbt_read_verify: XFS_CORRUPTION_ERROR
xfs_allocbt_read_verify: XFS_CORRUPTION_ERROR
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
imap claims a free inode 1073742013 is in use, correcting imap and clearing inode
cleared inode 1073742013
        - agno = 3
imap claims a free inode 1610612893 is in use, correcting imap and clearing inode
cleared inode 1610612893
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
__read_verify: XFS_CORRUPTION_ERROR
can't read leaf block 8388608 for directory inode 128
rebuilding directory inode 128
name create failed in ino 128 (117), filesystem may be out of space
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
============

It's not clear to me whether that actually fixed anything or not, but I
think I'm going to put off a second run, or a run on the other FS which
threw more CORRUPTION errors in a later stage, until I have a better
idea what's going on...

Cheers,
-- jra
--
Jay R.
Ashworth                  Baylink                       jra@xxxxxxxxxxx
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com         2000 Land Rover DII
St Petersburg FL USA               #natog                      +1 727 647 1274

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
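
A side note on the 16 bytes that dmesg dumps just ahead of each trace:
they look like the start of the btree block that failed the check. Here
is a minimal sketch of decoding them by hand. It assumes the short-form
(non-CRC) btree block header is a 4-byte magic, a 2-byte level, a 2-byte
record count, and 4-byte left and right sibling pointers, all
big-endian. The function and field names are made up for illustration
and are not taken from xfsprogs.

```python
import struct

# Assumed short-form btree block header layout (big-endian on disk):
# magic[4], level (u16), numrecs (u16), leftsib (u32), rightsib (u32).
HEADER = struct.Struct(">4sHHII")
NULL_SIBLING = 0xFFFFFFFF  # all-ones value used for "no sibling"


def decode_header(hex_dump):
    """Decode the 16 header bytes as printed in the dmesg hex dump."""
    raw = bytes.fromhex(hex_dump)
    magic, level, numrecs, leftsib, rightsib = HEADER.unpack(raw[:HEADER.size])
    return {
        "magic": magic.decode("ascii"),
        "level": level,
        "numrecs": numrecs,
        "leftsib": leftsib,
        "rightsib": rightsib,
    }


def bits_differing(a, b):
    """Count the bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # The two dumps quoted in the dmesg output above.
    dumps = {
        "first": "41 42 54 42 00 00 00 04 df ff ff ff ff ff ff ff",
        "second": "41 42 54 43 00 00 00 04 df ff ff ff ff ff ff ff",
    }
    for label, dump in dumps.items():
        hdr = decode_header(dump)
        print(label, hdr["magic"], "level", hdr["level"],
              "numrecs", hdr["numrecs"],
              "leftsib", hex(hdr["leftsib"]),
              "rightsib", hex(hdr["rightsib"]),
              "leftsib bits off null:",
              bits_differing(hdr["leftsib"], NULL_SIBLING))
```

Under that assumed layout, both dumps decode to a level 0 block with 4
records and a right sibling of 0xffffffff; the magic is ABTB in the
first and ABTC in the second. In both, the left sibling is 0xdfffffff,
which differs from the all-ones value by a single bit. That may or may
not mean anything.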