Re: [PATCH 3/3] xfs: refactor xfs_buf_log_item reference count handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 29, 2018 at 08:59:09AM +1000, Dave Chinner wrote:
> On Mon, Aug 27, 2018 at 10:25:48AM -0400, Brian Foster wrote:
> > The xfs_buf_log_item structure has a reference counter with slightly
> > tricky semantics. In the common case, a buffer is logged and
> > committed in a transaction, committed to the on-disk log (added to
> > the AIL) and then finally written back and removed from the AIL. The
> > bli refcount covers two potentially overlapping timeframes:
> > 
> >  1. the bli is held in an active transaction
> >  2. the bli is pinned by the log
> > 
> > The caveat to this approach is that the reference counter does not
> > purely dictate the lifetime of the bli. IOW, when a dirty buffer is
> > physically logged and unpinned, the bli refcount may go to zero as
> > the log item is inserted into the AIL. Only once the buffer is
> > written back can the bli finally be freed.
> > 
> > The above semantics means that it is not enough for the various
> > refcount decrementing contexts to release the bli on decrement to
> > zero. xfs_trans_brelse(), transaction commit (->iop_unlock()) and
> > unpin (->iop_unpin()) must all drop the associated reference and
> > make additional checks to determine if the current context is
> > responsible for freeing the item.
> > 
> > For example, if a transaction holds but does not dirty a particular
> > bli, the commit may drop the refcount to zero. If the bli itself is
> > clean, it is also not AIL resident and must be freed at this time.
> > The same is true for xfs_trans_brelse(). If the transaction dirties
> > a bli and then aborts or an unpin results in an abort due to a log
> > I/O error, the last reference count holder is expected to explicitly
> > remove the item from the AIL and release it (since an abort means
> > filesystem shutdown and metadata writeback will never occur).
> > 
> > This leads to fairly complex checks being replicated in a few
> > different places. Since ->iop_unlock() and xfs_trans_brelse() are
> > nearly identical, refactor the logic into a common helper that
> > implements and documents the semantics in one place. This patch does
> > not change behavior.
> 
> I think this improves the code further, because now we don't have to
> care about the buf item cleanup when dropping references.
> 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_buf_item.c  | 85 ++++++++++++++++++++++++------------------
> >  fs/xfs/xfs_buf_item.h  |  1 +
> >  fs/xfs/xfs_trans_buf.c | 22 +----------
> >  3 files changed, 51 insertions(+), 57 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> > index b39d1b5320e7..d0191451fe18 100644
> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -556,13 +556,12 @@ xfs_buf_item_unlock(
> >  {
> >  	struct xfs_buf_log_item	*bip = BUF_ITEM(lip);
> >  	struct xfs_buf		*bp = bip->bli_buf;
> > -	bool			freed;
> > -	bool			aborted;
> > +	bool			released;
> >  	bool			hold = bip->bli_flags & XFS_BLI_HOLD;
> > -	bool			dirty = bip->bli_flags & XFS_BLI_DIRTY;
> >  	bool			stale = bip->bli_flags & XFS_BLI_STALE;
> >  #if defined(DEBUG) || defined(XFS_WARN)
> >  	bool			ordered = bip->bli_flags & XFS_BLI_ORDERED;
> > +	bool			dirty = bip->bli_flags & XFS_BLI_DIRTY;
> >  #endif
> >  	trace_xfs_buf_item_unlock(bip);
> >  
> > @@ -574,8 +573,6 @@ xfs_buf_item_unlock(
> >  	       (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
> >  	ASSERT(!stale || (bip->__bli_format.blf_flags & XFS_BLF_CANCEL));
> >  
> > -	aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
> > -
> >  	/*
> >  	 * Clear the buffer's association with this transaction and
> >  	 * per-transaction state from the bli, which has been copied above.
> > @@ -584,40 +581,16 @@ xfs_buf_item_unlock(
> >  	bip->bli_flags &= ~(XFS_BLI_LOGGED | XFS_BLI_HOLD | XFS_BLI_ORDERED);
> >  
> >  	/*
> > -	 * Drop the transaction's bli reference and deal with the item if we had
> > -	 * the last one. We must free the item if clean or aborted since it
> > -	 * wasn't pinned by the log and this is the last chance to do so. If the
> > -	 * bli is freed and dirty (but non-aborted), the buffer was not dirty in
> > -	 * this transaction but modified by a previous one and still awaiting
> > -	 * writeback. In that case, the bli is freed on buffer writeback
> > -	 * completion.
> > +	 * Unref the item and unlock the buffer unless held or stale. Stale
> > +	 * buffers remain locked until final unpin unless the bli is freed by
> > +	 * the unref call. The latter implies shutdown because buffer
> > +	 * invalidation dirties the bli and transaction.
> >  	 */
> > -	freed = atomic_dec_and_test(&bip->bli_refcount);
> > -	if (freed) {
> > -		ASSERT(!aborted || XFS_FORCED_SHUTDOWN(lip->li_mountp));
> > -		/*
> > -		 * An aborted item may be in the AIL regardless of dirty state.
> > -		 * For example, consider an aborted transaction that invalidated
> > -		 * a dirty bli and cleared the dirty state.
> > -		 */
> > -		if (aborted)
> > -			xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> > -		if (aborted || !dirty)
> > -			xfs_buf_item_relse(bp);
> > -	} else if (stale) {
> > -		/*
> > -		 * Stale buffers remain locked until final unpin unless the bli
> > -		 * was freed in the branch above. A freed stale bli implies an
> > -		 * abort because buffer invalidation dirties the bli and
> > -		 * transaction.
> > -		 */
> > -		ASSERT(!freed);
> > +	released = xfs_buf_item_unref(bip);
> > +	if ((stale && !released) || hold)
> >  		return;
> 
> Personal preference, but I'd put the hold check first because it's
> not a compound logic statement.
> 

Ok.

> > +bool
> > +xfs_buf_item_unref(
> 
> "+xfs_buf_item_put()" to be consistent with dropping references on
> other reference counted objects (e.g.  iput, dput, bio_put,
> xfs_perag_put, xfs_log_ticket_put, etc).
> 

Works for me.

> > +	struct xfs_buf_log_item	*bip)
> > +{
> > +	struct xfs_log_item	*lip = &bip->bli_item;
> > +	bool			aborted;
> > +	bool			dirty;
> > +	bool			freed;
> > +
> > +	/* drop the bli ref and return if it wasn't the last one */
> > +	freed = atomic_dec_and_test(&bip->bli_refcount);
> > +	if (!freed)
> > +		return false;
> 
> We don't really need the freed variable here. This
> 
> 	if (!atomic_dec_and_test(refcount))
> 		return false
> 
> is a common enough "do something only when we drop the last
> reference" pattern that everyone should understand it by
> now.
> 

Yeah, I think I just missed that the associated asserts (where freed was
used) were made unnecessary by the simplified logic.

> > +	 * If the bli is dirty and non-aborted, the buffer was clean in the
> > +	 * transaction but still awaiting writeback from previous changes. In
> > +	 * that case, the bli is freed on buffer writeback completion.
> > +	 */
> > +	aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags) ||
> > +		  XFS_FORCED_SHUTDOWN(lip->li_mountp);
> > +	dirty = bip->bli_flags & XFS_BLI_DIRTY;
> > +	if (!aborted && dirty)
> > +		return false;
> 
> Personal preference (again) is to check dirty first - same as the
> comment text mentions "dirty and not-aborted"...
> 

Makes sense. The dirty/clean state is also the more common/primary
consideration, whereas aborted == true should obviously be a
rare/outlier case.

> These are all minor things - I think it's a good improvement
> overall.
> 

Thanks. I'll send a v3 with the associated tweaks..

Brian

> Cheers,
> 
> dave.
> 
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux