On 4/18/13 8:23 AM, 符永涛 wrote:
> Hi Brian and Eric,
> The shutdown is not easy to produce but finally right now 2 of our servers in our test cluster xfs was shutdown.
>
> the trace output as following
> https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing
>
> Sorry but the systemtap is interrupt and I didn't noticed that so I didn't get systemtap logs.
>
> /var/log/message is same as before
> Apr 18 22:43:14 10 kernel: XFS (sdb): : xfs_inotobp() returned error 22.
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02d44aa
> Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
> Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
> Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>
> The metadump file is large I'll share it to you soon.
>

Thanks, we'll take a look.

Just to double check, in the kernel that ran the tracepoints, did you use Brian's 2nd version of the patch? I want to make sure the tracepoints were at the top of the function.

Since you're patching xfs anyway, can you add something like this for next time:

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 796edce..cad0e8e 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1777,8 +1777,9 @@ xfs_iunlink_remove(
 					&last_ibp, &last_offset, 0);
 			if (error) {
 				xfs_warn(mp,
-	"%s: xfs_inotobp() returned error %d.",
-					__func__, error);
+	"%s: xfs_inotobp() returned error %d "
+	"for inode 0x%llx ag %d agino %x\n",
+					__func__, error, ip->i_ino, agno, agino);
 				return error;
 			}
 			next_agino = be32_to_cpu(last_dip->di_next_unlinked);

so that when we encounter the error we're sure to have the problematic inode number.
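
For reference when reading the new message, here is a minimal userspace sketch (not part of the patch) of how the three printed values relate: the AG number is the high bits of the inode number and the agino is the low bits, with the split point set by the superblock fields sb_agblklog and sb_inopblog. The struct and function names below are hypothetical, made up for illustration; the kernel does this with its own XFS_INO_TO_AGNO/XFS_INO_TO_AGINO macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two superblock values the split depends on. */
struct xfs_geom_sketch {
	unsigned agblklog;	/* sb_agblklog: log2 of AG size in blocks, rounded up */
	unsigned inopblog;	/* sb_inopblog: log2 of inodes per block */
};

/* Number of low bits of an inode number that hold the AG-relative part. */
static unsigned agino_bits(const struct xfs_geom_sketch *g)
{
	unsigned bits = g->agblklog + g->inopblog;

	assert(bits > 0 && bits < 64);	/* keep the shifts below well defined */
	return bits;
}

/* High bits of the inode number: which allocation group it lives in. */
static uint32_t ino_to_agno(const struct xfs_geom_sketch *g, uint64_t ino)
{
	return (uint32_t)(ino >> agino_bits(g));
}

/* Low bits of the inode number: the AG-relative inode number (agino). */
static uint32_t ino_to_agino(const struct xfs_geom_sketch *g, uint64_t ino)
{
	return (uint32_t)(ino & ((UINT64_C(1) << agino_bits(g)) - 1));
}
```

With assumed example values agblklog = 22 and inopblog = 4, an inode number of (3 << 26) | 0x1234 splits into ag 3 and agino 0x1234, so a mismatch between the three values in the log would itself be a hint about where things went wrong.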

Thanks,
-Eric