Re: [PATCH 1/5] xfs: don't try to mark uncached buffers stale on error.

Ben Myers <bpm@xxxxxxx> · Tue, 24 Sep 2013 10:33:24 -0500

Hi Dave,

On Tue, Sep 24, 2013 at 04:01:12PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> fsstress failed during a shutdown with the following assert:
> 
> XFS: Assertion failed: xfs_buf_islocked(bp), file: fs/xfs/xfs_buf.c, line: 143
> .....
>  xfs_buf_stale+0x3f/0xf0
>  xfs_bioerror_relse+0x2d/0x90
>  xfsbdstrat+0x51/0xa0

Here you're showing an assert reported through an xfsbdstrat codepath...

>  xfs_zero_remaining_bytes+0x1d1/0x2d0
>  xfs_free_file_space+0x5d0/0x600
>  xfs_change_file_space+0x251/0x3a0
>  xfs_ioc_space+0xcc/0x130
> .....
> 
> xfs_zero_remaining_bytes() works with uncached buffers, and hence if
> we are preventing IO due to a shutdown, we should not be marking it
> stale as that is only for cached buffers. Instead, just mark it with
> an error and make sure it gets to the caller.
> 
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_buf.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 2634700..956685f 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1093,25 +1093,20 @@ xfs_bioerror_relse(
>  	struct xfs_buf	*bp)
>  {
>  	int64_t		fl = bp->b_flags;
> +
>  	/*
> -	 * No need to wait until the buffer is unpinned.
> -	 * We aren't flushing it.
> -	 *
> -	 * chunkhold expects B_DONE to be set, whether
> -	 * we actually finish the I/O or not. We don't want to
> -	 * change that interface.
> +	 * No need to wait until the buffer is unpinned. We aren't flushing it.
>  	 */
>  	XFS_BUF_UNREAD(bp);
>  	XFS_BUF_DONE(bp);
>  	xfs_buf_stale(bp);
>  	bp->b_iodone = NULL;
> +
> +	/*
> +	 * There's no reason to mark error for ASYNC buffers as there is no-one
> +	 * waiting to collect the error.
> +	 */
>  	if (!(fl & XBF_ASYNC)) {
> -		/*
> -		 * Mark b_error and B_ERROR _both_.
> -		 * Lot's of chunkcache code assumes that.
> -		 * There's no reason to mark error for
> -		 * ASYNC buffers.
> -		 */
>  		xfs_buf_ioerror(bp, EIO);
>  		complete(&bp->b_iowait);
>  	} else {
> @@ -1128,11 +1123,15 @@ xfs_bdstrat_cb(
>  	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
>  		trace_xfs_bdstrat_shut(bp, _RET_IP_);
>  		/*
> -		 * Metadata write that didn't get logged but
> -		 * written delayed anyway. These aren't associated
> -		 * with a transaction, and can be ignored.
> +		 * If this is a cached write, then it is likely to be a delayed
> +		 * write metadata buffer that can be ignored because the
> +		 * contents are logged. If it's an uncached buffer or a read
> +		 * operation, then the caller will get the error through the
> +		 * normal IO completion path. We can tell if the buffer is
> +		 * cached or not by looking to see if the b_pag field is NULL or
> +		 * not.
>  		 */
> -		if (!bp->b_iodone && !XFS_BUF_ISREAD(bp))
> +		if (!bp->b_iodone && !XFS_BUF_ISREAD(bp) && bp->b_pag)

...but it looks like your fix is in xfs_bdstrat_cb, which is not on the
xfsbdstrat path shown in the stack you posted above.  What am I missing?
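
To make the question concrete, here is a small userspace toy model, not
kernel code. Every toy_* name and the lock_assert_tripped field are made up
for illustration. It only mirrors what is visible in the quoted trace and
hunks: xfs_bioerror_relse still calls xfs_buf_stale unconditionally, while
the new b_pag test sits in xfs_bdstrat_cb. It assumes the uncached buffer is
unlocked, as the failed assert suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct xfs_buf: only the fields the quoted hunks test. */
struct toy_buf {
	void	*b_pag;			/* non-NULL only for cached buffers */
	void	*b_iodone;		/* completion callback, if any */
	bool	is_read;		/* stands in for XFS_BUF_ISREAD(bp) */
	bool	locked;			/* stands in for xfs_buf_islocked(bp) */
	bool	lock_assert_tripped;	/* hypothetical: records the failed ASSERT */
};

/* Models xfs_buf_stale(): record the failure instead of aborting. */
static void toy_buf_stale(struct toy_buf *bp)
{
	if (!bp->locked)
		bp->lock_assert_tripped = true;
}

/* Models xfs_bioerror_relse() as quoted: stale runs for every buffer. */
static void toy_bioerror_relse(struct toy_buf *bp)
{
	assert(bp != NULL);
	toy_buf_stale(bp);
	bp->b_iodone = NULL;
}

/* Models the patched condition in xfs_bdstrat_cb(). */
static bool toy_bdstrat_cb_is_ignorable(const struct toy_buf *bp)
{
	return !bp->b_iodone && !bp->is_read && bp->b_pag != NULL;
}
```

In this model, an unlocked buffer with b_pag == NULL fails the
toy_bdstrat_cb_is_ignorable() test as intended, but passing the same buffer
to toy_bioerror_relse() still sets lock_assert_tripped, because nothing on
that path looks at b_pag.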

Thanks,
	Ben
