Re: [PATCH 4/3 v2] xfs: async CIL flushes need pending pushes to be made stable

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 08, 2022 at 05:12:08PM +1100, Dave Chinner wrote:
> The fix is two-fold - first we should always set the
> cil->xc_push_commit_stable when xlog_cil_flush() is called,
> regardless of whether there is already a pending push or not.
> 
> Second, if the CIL is empty, we should trigger an iclog flush to
> ensure that all the iclogs of the last checkpoint have actually been
> submitted to disk as that checkpoint may not have been run under
> stable completion constraints.
> 
> Reported-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Fixes: 0020a190cf3e ("xfs: AIL needs asynchronous CIL forcing")
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
> Version 2:
> - if the CIL is empty we need to push on the iclogs to ensure that
>   the previous CIL flush did actually make it to stable storage.
>   This closes a race condition where the AIL doesn't actually
>   trigger the last CIL push and so the CIL is already empty by the
>   time it tries to flush it to unpin pinned items.
> 
> Note: this is against the XFS CIL scalability patchset. The
> XLOG_CIL_EMPTY check on a mainline kernel will be:
> 
> 	if (list_empty(&log->l_cilp->xc_cil))
> 		xfs_log_force(log->l_mp, 0);

This is it.  100 runs of g/530 all completed in 2 or 3 seconds.  Thanks!

Tested-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>

Just to be clear, my testing was done on b08968f196d4 (in Linus' tree),
plus my fs-folio patches, plus patches 1-3 from this series, plus this
(whitespace damaged):

+++ b/fs/xfs/xfs_log_cil.c
@@ -1243,18 +1243,27 @@ xlog_cil_push_now(
        if (!async)
                flush_workqueue(cil->xc_push_wq);

+       spin_lock(&cil->xc_push_lock);
+
+       /*
+        * If this is an async flush request, we always need to set the
+        * xc_push_commit_stable flag even if something else has already queued
+        * a push. The flush caller is asking for the CIL to be on stable
+        * storage when the next push completes, so regardless of who has queued
+        * the push, the flush requires stable semantics from it.
+        */
+       cil->xc_push_commit_stable = async;
+
        /*
         * If the CIL is empty or we've already pushed the sequence then
-        * there's no work we need to do.
+        * there's no more work that we need to do.
         */
-       spin_lock(&cil->xc_push_lock);
        if (list_empty(&cil->xc_cil) || push_seq <= cil->xc_push_seq) {
                spin_unlock(&cil->xc_push_lock);
                return;
        }

        cil->xc_push_seq = push_seq;
-       cil->xc_push_commit_stable = async;
        queue_work(cil->xc_push_wq, &cil->xc_ctx->push_work);
        spin_unlock(&cil->xc_push_lock);
 }
@@ -1352,6 +1361,8 @@ xlog_cil_flush(

        trace_xfs_log_force(log->l_mp, seq, _RET_IP_);
        xlog_cil_push_now(log, seq, true);
+        if (list_empty(&log->l_cilp->xc_cil))
+                xfs_log_force(log->l_mp, 0);
 }

 /*




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux