From: Darrick J. Wong <djwong@xxxxxxxxxx> While debugging some very strange rmap corruption reports in connection with the online directory repair code. I root-caused the error to the following incorrect sequence: <start repair transaction> <expand directory, causing a deferred rmap to be queued> <roll transaction> <cancel transaction> Obviously, we should have committed the transaction instead of cancelling it. Thinking more broadly, however, xfs_trans_cancel should have warned us that we were throwing away work item that we already committed to performing. This is not correct, and we need to shut down the filesystem. Change xfs_trans_cancel to complain in the loudest manner if we're cancelling any transaction with deferred work items attached. Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> --- libxfs/trans.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/libxfs/trans.c b/libxfs/trans.c index fd2e6f9d..8c16cb8d 100644 --- a/libxfs/trans.c +++ b/libxfs/trans.c @@ -318,13 +318,30 @@ void libxfs_trans_cancel( struct xfs_trans *tp) { + bool dirty; + trace_xfs_trans_cancel(tp, _RET_IP_); if (tp == NULL) return; + dirty = (tp->t_flags & XFS_TRANS_DIRTY); - if (tp->t_flags & XFS_TRANS_PERM_LOG_RES) + /* + * It's never valid to cancel a transaction with deferred ops attached, + * because the transaction is effectively dirty. Complain about this + * loudly before freeing the in-memory defer items. + */ + if (!list_empty(&tp->t_dfops)) { + ASSERT(list_empty(&tp->t_dfops)); + ASSERT(tp->t_flags & XFS_TRANS_PERM_LOG_RES); + dirty = true; xfs_defer_cancel(tp); + } + + if (dirty) { + fprintf(stderr, _("Cancelling dirty transaction!\n")); + abort(); + } xfs_trans_free_items(tp); xfs_trans_free(tp);