On Wed, Apr 03, 2013 at 02:09:21PM +1100, Dave Chinner wrote: > From: Dave Chinner <dchinner@xxxxxxxxxx> > > Filesystems are occasionally being shut down with this error: > > xfs_trans_ail_delete_bulk: attempting to delete a log item that is > not in the AIL. > > It was diagnosed to be related to the EFI/EFD commit order when the > EFI and EFD are in different checkpoints and the EFD is committed > before the EFI here: > > http://oss.sgi.com/archives/xfs/2013-01/msg00082.html > > The real problem is that a single bit cannot fully describe the > states that the EFI/EFD processing can be in. These completion > states are: > > EFI EFI in AIL EFD Result > committed/unpinned Yes committed OK > committed/pinned No committed Shutdown > uncommitted No committed Shutdown > > > Note that the "result" field is what should happen, not what does > happen. The current logic is broken and handles the first two cases > correctly by luck. That is, the code will free the EFI if the > XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The > inverted logic "works" because if both EFI and EFD are committed, > then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED > bit, and the second frees the EFI item. Hence as long as > xfs_efi_item_committed() has been called, everything appears to be > fine. > > It is the third case where the logic fails - where > xfs_efd_item_committed() is called before xfs_efi_item_committed(), > and that results in the EFI being freed before it has been > committed. That is the bug that triggered the shutdown, and hence > keeping track of whether the EFI has been committed or not is > insufficient to correctly order the EFI/EFD operations w.r.t. the > AIL. > > What we really want is this: the EFI is always placed into the > AIL before the last reference goes away. The only way to guarantee > that is that the EFI is not freed until after it has been unpinned > *and* the EFD has been committed. That is, restructure the logic so > that the only case that can occur is the first case. > > This can be done easily by replacing the XFS_EFI_COMMITTED with an > EFI reference count. The EFI is initialised with it's own count, and > that is not released until it is unpinned. However, there is a > complication to this method - the high level EFI/EFD code in > xfs_bmap_finish() does not hold direct references to the EFI > structure, and runs a transaction commit between the EFI and EFD > processing. Hence the EFI can be freed even before the EFD is > created using such a method. > > Further, log recovery uses the AIL for tracking EFI/EFDs that need > to be recovered, but it uses the AIL *differently* to the EFI > transaction commit. Hence log recovery never pins or unpins EFIs, so > we can't drop the EFI reference count indirectly to free the EFI. > > However, this doesn't prevent us from using a reference count here. > There is a 1:1 relationship between EFIs and EFDs, so when we > initialise the EFI we can take a reference count for the EFD as > well. This solves the xfs_bmap_finish() issue - the EFI will never > be freed until the EFD is processed. In terms of log recovery, > during the committing of the EFD we can look for the > XFS_EFI_RECOVERED bit being set and drop the EFI reference as well, > thereby ensuring everything works correctly there as well. > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> Applied. Regards, Ben _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs