Re: [PATCH v2] xfs: drain the buf delwri queue before xfsaild idles

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 16, 2020 at 06:49:41AM -0400, Brian Foster wrote:
> xfsaild is racy with respect to transaction abort and shutdown in
> that the task can idle or exit with an empty AIL but buffers still
> on the delwri queue. This was partly addressed by cancelling the
> delwri queue before the task exits to prevent memory leaks, but it's
> also possible for xfsaild to empty and idle with buffers on the
> delwri queue. For example, a transaction that pins a buffer that
> also happens to sit on the AIL delwri queue will explicitly remove
> the associated log item from the AIL if the transaction aborts. The
> side effect of this is an unmount hang in xfs_wait_buftarg() as the
> associated buffers remain held by the delwri queue indefinitely.
> This is reproduced on repeated runs of generic/531 with an fs format
> (-mrmapbt=1 -bsize=1k) that happens to also reproduce transaction
> aborts.
> 
> Update xfsaild to not idle until both the AIL and associated delwri
> queue are empty and update the push code to continue delwri queue
> submission attempts even when the AIL is empty. This allows the AIL
> to eventually release aborted buffers stranded on the delwri queue
> when they are unlocked by the associated transaction. This should
> have no significant effect on normal runtime behavior because the
> xfsaild currently idles only when the AIL is empty and in practice
> the AIL is rarely empty with a populated delwri queue. The items
> must be AIL resident to land in the queue in the first place and
> generally aren't removed until writeback completes.
> 
> Note that the pre-existing delwri queue cancel logic in the exit
> path is retained because task stop is external, could technically
> come at any point, and xfsaild is still responsible to release its
> buffer references before it exits.
> 
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>

Looks fine to me...
Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

--D

> ---
> 
> v2:
> - Move the out_done label up a bit.
> v1: https://lore.kernel.org/linux-xfs/20200715123835.8690-1-bfoster@xxxxxxxxxx/
> 
>  fs/xfs/xfs_trans_ail.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index c3be6e440134..0c783d339675 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -448,16 +448,10 @@ xfsaild_push(
>  	target = ailp->ail_target;
>  	ailp->ail_target_prev = target;
>  
> +	/* we're done if the AIL is empty or our push has reached the end */
>  	lip = xfs_trans_ail_cursor_first(ailp, &cur, ailp->ail_last_pushed_lsn);
> -	if (!lip) {
> -		/*
> -		 * If the AIL is empty or our push has reached the end we are
> -		 * done now.
> -		 */
> -		xfs_trans_ail_cursor_done(&cur);
> -		spin_unlock(&ailp->ail_lock);
> +	if (!lip)
>  		goto out_done;
> -	}
>  
>  	XFS_STATS_INC(mp, xs_push_ail);
>  
> @@ -539,6 +533,8 @@ xfsaild_push(
>  			break;
>  		lsn = lip->li_lsn;
>  	}
> +
> +out_done:
>  	xfs_trans_ail_cursor_done(&cur);
>  	spin_unlock(&ailp->ail_lock);
>  
> @@ -546,7 +542,6 @@ xfsaild_push(
>  		ailp->ail_log_flush++;
>  
>  	if (!count || XFS_LSN_CMP(lsn, target) >= 0) {
> -out_done:
>  		/*
>  		 * We reached the target or the AIL is empty, so wait a bit
>  		 * longer for I/O to complete and remove pushed items from the
> @@ -638,7 +633,8 @@ xfsaild(
>  		 */
>  		smp_rmb();
>  		if (!xfs_ail_min(ailp) &&
> -		    ailp->ail_target == ailp->ail_target_prev) {
> +		    ailp->ail_target == ailp->ail_target_prev &&
> +		    list_empty(&ailp->ail_buf_list)) {
>  			spin_unlock(&ailp->ail_lock);
>  			freezable_schedule();
>  			tout = 0;
> -- 
> 2.21.3
> 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux