Re: [PATCH 2/3] xfs: periodically relog deferred intent items

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 17, 2020 at 04:11:48PM +1000, Dave Chinner wrote:
> On Wed, Sep 16, 2020 at 08:30:00PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > 
> > There's a subtle design flaw in the deferred log item code that can lead
> > to pinning the log tail.  Taking up the defer ops chain examples from
> > the previous commit, we can get trapped in sequences like this:
> > 
> > Caller hands us a transaction t0 with D0-D3 attached.  The defer ops
> > chain will look like the following if the transaction rolls succeed:
> > 
> > t1: D0(t0), D1(t0), D2(t0), D3(t0)
> > t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
> > t3: d5(t1), D1(t0), D2(t0), D3(t0)
> > ...
> > t9: d9(t7), D3(t0)
> > t10: D3(t0)
> > t11: d10(t10), d11(t10)
> > t12: d11(t10)
> > 
> > In transaction 9, we finish d9 and try to roll to t10 while holding onto
> > an intent item for D3 that we logged in t0.
> > 
> > The previous commit changed the order in which we place new defer ops in
> > the defer ops processing chain to reduce the maximum chain length.  Now
> > make xfs_defer_finish_noroll capable of relogging the entire chain
> > periodically so that we can always move the log tail forward.  We do
> > this every seven loops, having observed that while most chains never
> > exceed seven items in length, the rest go far over that and seem to
> > be involved in most of the stall problems.
> > 
> > Callers are now required to ensure that the transaction reservation is
> > large enough to handle logging done items and new intent items for the
> > maximum possible chain length.  Most callers are careful to keep the
> > chain lengths low, so the overhead should be minimal.
> > 
> > (Note that in the next patch we'll make it so that we only relog on
> > demand, since 7 is an arbitrary number that I used here to get the basic
> > mechanics working.)
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > ---
> >  fs/xfs/libxfs/xfs_defer.c  |   30 ++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_bmap_item.c     |   25 +++++++++++++++++++++++++
> >  fs/xfs/xfs_extfree_item.c  |   29 +++++++++++++++++++++++++++++
> >  fs/xfs/xfs_refcount_item.c |   27 +++++++++++++++++++++++++++
> >  fs/xfs/xfs_rmap_item.c     |   27 +++++++++++++++++++++++++++
> >  fs/xfs/xfs_trace.h         |    1 +
> >  fs/xfs/xfs_trans.h         |   10 ++++++++++
> >  7 files changed, 149 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> > index 84a70edd0da1..7938e4d3af90 100644
> > --- a/fs/xfs/libxfs/xfs_defer.c
> > +++ b/fs/xfs/libxfs/xfs_defer.c
> > @@ -361,6 +361,28 @@ xfs_defer_cancel_list(
> >  	}
> >  }
> >  
> > +/*
> > + * Prevent a log intent item from pinning the tail of the log by logging a
> > + * done item to release the intent item; and then log a new intent item.
> > + * The caller should provide a fresh transaction and roll it after we're done.
> > + */
> > +static int
> > +xfs_defer_relog(
> > +	struct xfs_trans		**tpp,
> > +	struct list_head		*dfops)
> > +{
> > +	struct xfs_defer_pending	*dfp;
> > +
> > +	ASSERT((*tpp)->t_flags & XFS_TRANS_PERM_LOG_RES);
> > +
> > +	list_for_each_entry(dfp, dfops, dfp_list) {
> > +		trace_xfs_defer_relog_intent((*tpp)->t_mountp, dfp);
> > +		dfp->dfp_intent = xfs_trans_item_relog(dfp->dfp_intent, *tpp);
> 
> Any reason for xfs_trans_item_relog() when it's a one liner?

There aren't log intent items in userspace, so xfs_trans_item_relog
becomes a NOP macro in the xfsprogs port.

> > +	}
> > +
> > +	return xfs_defer_trans_roll(tpp);
> > +}
> > +
> >  /*
> >   * Log an intent-done item for the first pending intent, and finish the work
> >   * items.
> > @@ -422,6 +444,7 @@ xfs_defer_finish_noroll(
> >  	struct xfs_trans		**tp)
> >  {
> >  	struct xfs_defer_pending	*dfp;
> > +	unsigned int			nr_rolls = 0;
> >  	int				error = 0;
> >  	LIST_HEAD(dop_pending);
> >  
> > @@ -447,6 +470,13 @@ xfs_defer_finish_noroll(
> >  		if (error)
> >  			goto out_shutdown;
> >  
> > +		/* Every few rolls we relog all the intent items. */
> > +		if (!(++nr_rolls % 7)) {
> > +			error = xfs_defer_relog(tp, &dop_pending);
> > +			if (error)
> > +				goto out_shutdown;
> > +		}
> 
> Urk.
> 
> I think I've got a better idea: rather than a counter, use something
> meaningful as to whether the intent has been committed or not. e.g.
> use something like xfs_log_item_in_current_chkpt() to determine if
> we need to relog the intent.

I'll take a look at that in the morning.

> i.e. If the intent is active in the CIL, then we don't need to relog
> it. If the intent has been committed to the journal and is no longer
> in the CIL list, relog it so the next CIL push will move it forward
> in the journal.
> 
> The intent relogging functions look fine, though.

Thanks for digging through some of these. :)

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux