Hi David,

Thank you for your help. One of our machines is still online, so I can
only run xfs_repair on one machine. Below is the output:

xfs_repair -n /dev/sdb |tee xfs_repair.log
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
agi unlinked bucket 1 is 4293569 in ag 0 (inode=4293569)
agi unlinked bucket 3 is 4455043 in ag 0 (inode=4455043)
agi unlinked bucket 22 is 3949334 in ag 0 (inode=3949334)
agi unlinked bucket 24 is 3960984 in ag 0 (inode=3960984)
agi unlinked bucket 28 is 4193564 in ag 0 (inode=4193564)
agi unlinked bucket 38 is 4722982 in ag 0 (inode=4722982)
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
7f2631b97700: Badness in key lookup (length)
bp=(bno 1974656, len 16384 bytes) key=(bno 1974656, len 8192 bytes)
7f2631b97700: Badness in key lookup (length)
bp=(bno 1980464, len 16384 bytes) key=(bno 1980464, len 8192 bytes)
7f2631b97700: Badness in key lookup (length)
bp=(bno 2096752, len 16384 bytes) key=(bno 2096752, len 8192 bytes)
7f2631b97700: Badness in key lookup (length)
bp=(bno 2146768, len 16384 bytes) key=(bno 2146768, len 8192 bytes)
7f2631b97700: Badness in key lookup (length)
bp=(bno 2227504, len 16384 bytes) key=(bno 2227504, len 8192 bytes)
7f2631b97700: Badness in key lookup (length)
bp=(bno 2361472, len 16384 bytes) key=(bno 2361472, len 8192 bytes)
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 4
        - agno = 5
        - agno = 7
        - agno = 9
        - agno = 6
        - agno = 3
        - agno = 8
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 4723622, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 4723622 nlinks from 0 to 1
No modify flag set, skipping filesystem flush and exiting.

xfs_check /dev/sdb;echo $?
[root@10.10.135.25 ~]# xfs_check /dev/sdb;echo $?
agi unlinked bucket 1 is 4293569 in ag 0 (inode=4293569)
agi unlinked bucket 3 is 4455043 in ag 0 (inode=4455043)
agi unlinked bucket 22 is 3949334 in ag 0 (inode=3949334)
agi unlinked bucket 24 is 3960984 in ag 0 (inode=3960984)
agi unlinked bucket 28 is 4193564 in ag 0 (inode=4193564)
agi unlinked bucket 38 is 4722982 in ag 0 (inode=4722982)
allocated inode 4723622 has 0 link count
3

xfs_repair -n /dev/sdb;echo $?
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
agi unlinked bucket 1 is 4293569 in ag 0 (inode=4293569)
agi unlinked bucket 3 is 4455043 in ag 0 (inode=4455043)
agi unlinked bucket 22 is 3949334 in ag 0 (inode=3949334)
agi unlinked bucket 24 is 3960984 in ag 0 (inode=3960984)
agi unlinked bucket 28 is 4193564 in ag 0 (inode=4193564)
agi unlinked bucket 38 is 4722982 in ag 0 (inode=4722982)
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
7f35d7d9e700: Badness in key lookup (length)
bp=(bno 1974656, len 16384 bytes) key=(bno 1974656, len 8192 bytes)
7f35d7d9e700: Badness in key lookup (length)
bp=(bno 1980464, len 16384 bytes) key=(bno 1980464, len 8192 bytes)
7f35d7d9e700: Badness in key lookup (length)
bp=(bno 2096752, len 16384 bytes) key=(bno 2096752, len 8192 bytes)
7f35d7d9e700: Badness in key lookup (length)
bp=(bno 2146768, len 16384 bytes) key=(bno 2146768, len 8192 bytes)
7f35d7d9e700: Badness in key lookup (length)
bp=(bno 2227504, len 16384 bytes) key=(bno 2227504, len 8192 bytes)
7f35d7d9e700: Badness in key lookup (length)
bp=(bno 2361472, len 16384 bytes) key=(bno 2361472, len 8192 bytes)
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 4
        - agno = 3
        - agno = 7
        - agno = 9
        - agno = 1
        - agno = 8
        - agno = 6
        - agno = 5
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 4723622, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 4723622 nlinks from 0 to 1
No modify flag set, skipping filesystem flush and exiting.
1

2013/1/6, Dave Chinner <david@xxxxxxxxxxxxx>:
> On Sat, Jan 05, 2013 at 09:47:13PM +0800, 符永涛 wrote:
>> Dear xfs experts,
>>
>> We're running glusterfs over top of xfs and recently we have
>> encountered xfs filesystem crash two times on two different servers.
>> Can you kindly help to give me some insight of how to debug this kind
>> of failure?
>> Bellow is xfs filesystem crash messages:
>> server1:
>> Dec 28 16:55:01 localhost kernel: XFS (dm-0): xfs_iunlink_remove:
>> xfs_inotobp() returned error 22.
>> Dec 28 16:55:01 localhost kernel: XFS (dm-0): xfs_inactive: xfs_ifree
>> returned error 22
>
> That's a sign of a corrupt inode unlinked list. It's also an oldish
> kernel - xfs_inotobp() doesn't exist anymore in the mainline
> kernel...
>
> What does 'xfs_repair -n /dev/dm-0' tell you about the unlinked
> lists?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
>

--
符永涛

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs