Re: [PATCH 2/3] xfs: periodically relog deferred intent items

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 16, 2020 at 08:30:00PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> 
> There's a subtle design flaw in the deferred log item code that can lead
> to pinning the log tail.  Taking up the defer ops chain examples from
> the previous commit, we can get trapped in sequences like this:
> 
> Caller hands us a transaction t0 with D0-D3 attached.  The defer ops
> chain will look like the following if the transaction rolls succeed:
> 
> t1: D0(t0), D1(t0), D2(t0), D3(t0)
> t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
> t3: d5(t1), D1(t0), D2(t0), D3(t0)
> ...
> t9: d9(t7), D3(t0)
> t10: D3(t0)
> t11: d10(t10), d11(t10)
> t12: d11(t10)
> 
> In transaction 9, we finish d9 and try to roll to t10 while holding onto
> an intent item for D3 that we logged in t0.
> 
> The previous commit changed the order in which we place new defer ops in
> the defer ops processing chain to reduce the maximum chain length.  Now
> make xfs_defer_finish_noroll capable of relogging the entire chain
> periodically so that we can always move the log tail forward.  We do
> this every seven loops, having observed that while most chains never
> exceed seven items in length, the rest go far over that and seem to
> be involved in most of the stall problems.
> 
> Callers are now required to ensure that the transaction reservation is
> large enough to handle logging done items and new intent items for the
> maximum possible chain length.  Most callers are careful to keep the
> chain lengths low, so the overhead should be minimal.
> 
> (Note that in the next patch we'll make it so that we only relog on
> demand, since 7 is an arbitrary number that I used here to get the basic
> mechanics working.)
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> ---
>  fs/xfs/libxfs/xfs_defer.c  |   30 ++++++++++++++++++++++++++++++
>  fs/xfs/xfs_bmap_item.c     |   25 +++++++++++++++++++++++++
>  fs/xfs/xfs_extfree_item.c  |   29 +++++++++++++++++++++++++++++
>  fs/xfs/xfs_refcount_item.c |   27 +++++++++++++++++++++++++++
>  fs/xfs/xfs_rmap_item.c     |   27 +++++++++++++++++++++++++++
>  fs/xfs/xfs_trace.h         |    1 +
>  fs/xfs/xfs_trans.h         |   10 ++++++++++
>  7 files changed, 149 insertions(+)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> index 84a70edd0da1..7938e4d3af90 100644
> --- a/fs/xfs/libxfs/xfs_defer.c
> +++ b/fs/xfs/libxfs/xfs_defer.c
> @@ -361,6 +361,28 @@ xfs_defer_cancel_list(
>  	}
>  }
>  
> +/*
> + * Prevent a log intent item from pinning the tail of the log by logging a
> + * done item to release the intent item; and then log a new intent item.
> + * The caller should provide a fresh transaction and roll it after we're done.
> + */
> +static int
> +xfs_defer_relog(
> +	struct xfs_trans		**tpp,
> +	struct list_head		*dfops)
> +{
> +	struct xfs_defer_pending	*dfp;
> +
> +	ASSERT((*tpp)->t_flags & XFS_TRANS_PERM_LOG_RES);
> +
> +	list_for_each_entry(dfp, dfops, dfp_list) {
> +		trace_xfs_defer_relog_intent((*tpp)->t_mountp, dfp);
> +		dfp->dfp_intent = xfs_trans_item_relog(dfp->dfp_intent, *tpp);

Any reason for xfs_trans_item_relog() when it's a one liner?

> +	}
> +
> +	return xfs_defer_trans_roll(tpp);
> +}
> +
>  /*
>   * Log an intent-done item for the first pending intent, and finish the work
>   * items.
> @@ -422,6 +444,7 @@ xfs_defer_finish_noroll(
>  	struct xfs_trans		**tp)
>  {
>  	struct xfs_defer_pending	*dfp;
> +	unsigned int			nr_rolls = 0;
>  	int				error = 0;
>  	LIST_HEAD(dop_pending);
>  
> @@ -447,6 +470,13 @@ xfs_defer_finish_noroll(
>  		if (error)
>  			goto out_shutdown;
>  
> +		/* Every few rolls we relog all the intent items. */
> +		if (!(++nr_rolls % 7)) {
> +			error = xfs_defer_relog(tp, &dop_pending);
> +			if (error)
> +				goto out_shutdown;
> +		}

Urk.

I think I've got a better idea: rather than a counter, use something
meaningful as to whether the intent has been committed or not. e.g.
use something like xfs_log_item_in_current_chkpt() to determine if
we need to relog the intent.

i.e. If the intent is active in the CIL, then we don't need to relog
it. If the intent has been committed to the journal and is no longer
in the CIL list, relog it so the next CIL push will move it forward
in the journal.

The intent relogging functions look fine, though.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux