Re: [PATCH 3/3] xfs: refactor xfs_buf_log_item reference count handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 27, 2018 at 10:25:48AM -0400, Brian Foster wrote:
> The xfs_buf_log_item structure has a reference counter with slightly
> tricky semantics. In the common case, a buffer is logged and
> committed in a transaction, committed to the on-disk log (added to
> the AIL) and then finally written back and removed from the AIL. The
> bli refcount covers two potentially overlapping timeframes:
> 
>  1. the bli is held in an active transaction
>  2. the bli is pinned by the log
> 
> The caveat to this approach is that the reference counter does not
> purely dictate the lifetime of the bli. IOW, when a dirty buffer is
> physically logged and unpinned, the bli refcount may go to zero as
> the log item is inserted into the AIL. Only once the buffer is
> written back can the bli finally be freed.
> 
> The above semantics means that it is not enough for the various
> refcount decrementing contexts to release the bli on decrement to
> zero. xfs_trans_brelse(), transaction commit (->iop_unlock()) and
> unpin (->iop_unpin()) must all drop the associated reference and
> make additional checks to determine if the current context is
> responsible for freeing the item.
> 
> For example, if a transaction holds but does not dirty a particular
> bli, the commit may drop the refcount to zero. If the bli itself is
> clean, it is also not AIL resident and must be freed at this time.
> The same is true for xfs_trans_brelse(). If the transaction dirties
> a bli and then aborts or an unpin results in an abort due to a log
> I/O error, the last reference count holder is expected to explicitly
> remove the item from the AIL and release it (since an abort means
> filesystem shutdown and metadata writeback will never occur).
> 
> This leads to fairly complex checks being replicated in a few
> different places. Since ->iop_unlock() and xfs_trans_brelse() are
> nearly identical, refactor the logic into a common helper that
> implements and documents the semantics in one place. This patch does
> not change behavior.

I think this improves the code further, because now we don't have to
care about the buf item cleanup when dropping references.

> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_buf_item.c  | 85 ++++++++++++++++++++++++------------------
>  fs/xfs/xfs_buf_item.h  |  1 +
>  fs/xfs/xfs_trans_buf.c | 22 +----------
>  3 files changed, 51 insertions(+), 57 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index b39d1b5320e7..d0191451fe18 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -556,13 +556,12 @@ xfs_buf_item_unlock(
>  {
>  	struct xfs_buf_log_item	*bip = BUF_ITEM(lip);
>  	struct xfs_buf		*bp = bip->bli_buf;
> -	bool			freed;
> -	bool			aborted;
> +	bool			released;
>  	bool			hold = bip->bli_flags & XFS_BLI_HOLD;
> -	bool			dirty = bip->bli_flags & XFS_BLI_DIRTY;
>  	bool			stale = bip->bli_flags & XFS_BLI_STALE;
>  #if defined(DEBUG) || defined(XFS_WARN)
>  	bool			ordered = bip->bli_flags & XFS_BLI_ORDERED;
> +	bool			dirty = bip->bli_flags & XFS_BLI_DIRTY;
>  #endif
>  	trace_xfs_buf_item_unlock(bip);
>  
> @@ -574,8 +573,6 @@ xfs_buf_item_unlock(
>  	       (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
>  	ASSERT(!stale || (bip->__bli_format.blf_flags & XFS_BLF_CANCEL));
>  
> -	aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
> -
>  	/*
>  	 * Clear the buffer's association with this transaction and
>  	 * per-transaction state from the bli, which has been copied above.
> @@ -584,40 +581,16 @@ xfs_buf_item_unlock(
>  	bip->bli_flags &= ~(XFS_BLI_LOGGED | XFS_BLI_HOLD | XFS_BLI_ORDERED);
>  
>  	/*
> -	 * Drop the transaction's bli reference and deal with the item if we had
> -	 * the last one. We must free the item if clean or aborted since it
> -	 * wasn't pinned by the log and this is the last chance to do so. If the
> -	 * bli is freed and dirty (but non-aborted), the buffer was not dirty in
> -	 * this transaction but modified by a previous one and still awaiting
> -	 * writeback. In that case, the bli is freed on buffer writeback
> -	 * completion.
> +	 * Unref the item and unlock the buffer unless held or stale. Stale
> +	 * buffers remain locked until final unpin unless the bli is freed by
> +	 * the unref call. The latter implies shutdown because buffer
> +	 * invalidation dirties the bli and transaction.
>  	 */
> -	freed = atomic_dec_and_test(&bip->bli_refcount);
> -	if (freed) {
> -		ASSERT(!aborted || XFS_FORCED_SHUTDOWN(lip->li_mountp));
> -		/*
> -		 * An aborted item may be in the AIL regardless of dirty state.
> -		 * For example, consider an aborted transaction that invalidated
> -		 * a dirty bli and cleared the dirty state.
> -		 */
> -		if (aborted)
> -			xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> -		if (aborted || !dirty)
> -			xfs_buf_item_relse(bp);
> -	} else if (stale) {
> -		/*
> -		 * Stale buffers remain locked until final unpin unless the bli
> -		 * was freed in the branch above. A freed stale bli implies an
> -		 * abort because buffer invalidation dirties the bli and
> -		 * transaction.
> -		 */
> -		ASSERT(!freed);
> +	released = xfs_buf_item_unref(bip);
> +	if ((stale && !released) || hold)
>  		return;

Personal preference, but I'd put the hold check first because it's
not a compound logic statement.

> +bool
> +xfs_buf_item_unref(

"+xfs_buf_item_put()" to be consistent with dropping references on
other reference counted objects (e.g.  iput, dput, bio_put,
xfs_perag_put, xfs_log_ticket_put, etc).

> +	struct xfs_buf_log_item	*bip)
> +{
> +	struct xfs_log_item	*lip = &bip->bli_item;
> +	bool			aborted;
> +	bool			dirty;
> +	bool			freed;
> +
> +	/* drop the bli ref and return if it wasn't the last one */
> +	freed = atomic_dec_and_test(&bip->bli_refcount);
> +	if (!freed)
> +		return false;

We don't really need the freed variable here. This

	if (!atomic_dec_and_test(refcount))
		return false

is a common enough "do something only when we drop the last
reference" pattern that everyone should understand it by
now.

> +	 * If the bli is dirty and non-aborted, the buffer was clean in the
> +	 * transaction but still awaiting writeback from previous changes. In
> +	 * that case, the bli is freed on buffer writeback completion.
> +	 */
> +	aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags) ||
> +		  XFS_FORCED_SHUTDOWN(lip->li_mountp);
> +	dirty = bip->bli_flags & XFS_BLI_DIRTY;
> +	if (!aborted && dirty)
> +		return false;

Personal preference (again) is to check dirty first - same as the
comment text mentions "dirty and not-aborted"...

These are all minor things - I think it's a good improvement
overall.

Cheers,

dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux