On Mon, Apr 15, 2013 at 07:14:39PM -0400, Brian Foster wrote:
> Hi,
>
> Thanks for the data in the previous thread:
>
> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>
> I'm spinning off a new thread specifically for this because the original
> thread is already too large and scattered to track. As Eric stated,
> please try to keep data contained in as few messages as possible.
>
> The data confirms Dave's theory where we are going off the end of the
> unlinked list when attempting to remove an inode, pass in NULLAGINO to
> xfs_inotobp() and the attempted conversion to a global inode number
> leads to EINVAL. The next question here is why wasn't the inode listed
> in the probe output on the unlinked inode list?
>
> Unfortunately we're probably going to require to start making some
> debug-level changes to the kernel to make progress on this issue. If you
> are able to recompile a kernel and/or xfs module (which you referred to
> doing in the previous thread), could you start with the patch appended
> to this message[1] and collect the xfs_iunlink and xfs_iunlink_remove
> tracepoint data the next time the problem occurs? E.g.,
>
> echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
> echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
> ... reproduce ...
> cat /sys/kernel/debug/tracing/trace > trace.output

It's better to use trace-cmd for this. It will result in fewer dropped
events, i.e.:

$ trace-cmd record -e xfs_iunlink\*
... reproduce ...
^C
$ trace-cmd report > trace.output

> --- a/fs/xfs/linux-2.6/xfs_trace.h
> +++ b/fs/xfs/linux-2.6/xfs_trace.h
> @@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>  DEFINE_INODE_EVENT(xfs_destroy_inode);
>  DEFINE_INODE_EVENT(xfs_write_inode);
>  DEFINE_INODE_EVENT(xfs_clear_inode);
> +DEFINE_INODE_EVENT(xfs_iunlink);
> +DEFINE_INODE_EVENT(xfs_iunlink_remove);
>
>  DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>  DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 796edce..a43bec5 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1670,6 +1670,8 @@ xfs_iunlink(
>  		(sizeof(xfs_agino_t) * bucket_index);
>  	xfs_trans_log_buf(tp, agibp, offset,
>  			  (offset + sizeof(xfs_agino_t) - 1));
> +
> +	trace_xfs_iunlink(ip);
>  	return 0;
>  }
>
> @@ -1820,6 +1822,8 @@ xfs_iunlink_remove(
>  			(offset + sizeof(xfs_agino_t) - 1));
>  		xfs_inobp_check(mp, last_ibp);
>  	}
> +
> +	trace_xfs_iunlink_remove(ip);
>  	return 0;

I would suggest that the tracing should be at the entry of the function,
otherwise we won't get a tracepoint for the operation that triggers the
shutdown. (That's the reason most tracepoints in XFS are at function
entry...)

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
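
To illustrate the entry-versus-exit point, here is a minimal user-space
sketch. It is not kernel code and not part of the patch: trace_event(),
remove_trace_at_exit() and remove_trace_at_entry() are hypothetical
stand-ins, and the "fail" flag simply simulates the EINVAL error path.

```c
#include <errno.h>

/* Hypothetical stand-in for a trace buffer: records event names in order. */
#define MAX_EVENTS 8
const char *events[MAX_EVENTS];
int nr_events;

void trace_event(const char *name)
{
	if (nr_events < MAX_EVENTS)
		events[nr_events++] = name;
}

/* Tracepoint just before the final return: an early error return skips it. */
int remove_trace_at_exit(int fail)
{
	if (fail)
		return -EINVAL;	/* simulated failure; nothing is traced */
	trace_event("remove_exit");
	return 0;
}

/* Tracepoint at function entry: recorded even if the call later fails. */
int remove_trace_at_entry(int fail)
{
	trace_event("remove_entry");
	if (fail)
		return -EINVAL;	/* simulated failure; event already recorded */
	return 0;
}
```

Calling each function with fail set to 1 leaves the event buffer empty for
the exit-placed variant and holds one event for the entry-placed variant,
which is the case we care about when the failing call is the interesting
one.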