[PATCH] xfs: close xc_cil list_empty() races with cil commit sequence

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We have seen somewhat rare reports of the following assert from
xlog_cil_push_background() failing during ltp tests or somewhat
innocuous desktop root fs workloads (e.g., virt operations, initramfs
construction):

	ASSERT(!list_empty(&cil->xc_cil));

The reasoning behind the assert is that the transaction has inserted
items to the CIL and hit background push codepath all with
cil->xc_ctx_lock held for reading. This locks out background commit from
emptying the CIL, which acquires the lock for writing. Therefore, the
reasoning is that the items previously inserted in the CIL should still
be present.

The cil->xc_ctx_lock read lock is not sufficient to protect the xc_cil
list, however, due to how CIL insertion is handled.
xlog_cil_insert_items() inserts and reorders the dirty transaction items
to the tail of the CIL under xc_cil_lock. It uses list_move_tail() to
achieve insertion and reordering in the same block of code. This
function removes and reinserts an item to the tail of the list. If a
transaction commits an item that was already logged and thus already
resides in the CIL, and said item is the sole item on the list, the
removal and reinsertion creates a temporary state where the list is
actually empty.

This state is not valid and thus should never be observed by concurrent
transaction commit-side checks in the circumstances outlined above.
Update all of the xc_cil checks to acquire xc_cil_lock before assessing
the state of xc_cil.

Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
---
 fs/xfs/xfs_log_cil.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index abc2ccb..324d449 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -687,7 +687,7 @@ xlog_cil_push_background(
 	 * The cil won't be empty because we are called while holding the
 	 * context lock so whatever we added to the CIL will still be there
 	 */
-	ASSERT(!list_empty(&cil->xc_cil));
+	ASSERT(!xlog_cil_empty(log));
 
 	/*
 	 * don't do a background push if we haven't used up all the
@@ -731,13 +731,17 @@ xlog_cil_push_now(
 	 * there's no work we need to do.
 	 */
 	spin_lock(&cil->xc_push_lock);
+	spin_lock(&cil->xc_cil_lock);
 	if (list_empty(&cil->xc_cil) || push_seq <= cil->xc_push_seq) {
+		spin_unlock(&cil->xc_cil_lock);
 		spin_unlock(&cil->xc_push_lock);
 		return;
 	}
+	spin_unlock(&cil->xc_cil_lock);
 
 	cil->xc_push_seq = push_seq;
 	queue_work(log->l_mp->m_cil_workqueue, &cil->xc_push_work);
+
 	spin_unlock(&cil->xc_push_lock);
 }
 
@@ -749,8 +753,10 @@ xlog_cil_empty(
 	bool		empty = false;
 
 	spin_lock(&cil->xc_push_lock);
+	spin_lock(&cil->xc_cil_lock);
 	if (list_empty(&cil->xc_cil))
 		empty = true;
+	spin_unlock(&cil->xc_cil_lock);
 	spin_unlock(&cil->xc_push_lock);
 	return empty;
 }
@@ -887,12 +893,15 @@ restart:
 	 * it means we haven't yet started the push, because if it had started
 	 * we would have found the context on the committing list.
 	 */
+	spin_lock(&cil->xc_cil_lock);
 	if (sequence == cil->xc_current_sequence &&
 	    !list_empty(&cil->xc_cil)) {
+		spin_unlock(&cil->xc_cil_lock);
 		spin_unlock(&cil->xc_push_lock);
 		goto restart;
 	}
 
+	spin_unlock(&cil->xc_cil_lock);
 	spin_unlock(&cil->xc_push_lock);
 	return commit_lsn;
 
-- 
1.9.3

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs



[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux