Re: [PATCH] xfs: check return codes when flushing block devices

Dave Chinner <david@xxxxxxxxxxxxx> · Mon, 1 Aug 2022 10:06:41 +1000

On Sun, Jul 31, 2022 at 09:22:28AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> If a block device cache flush fails, fsync needs to report that to upper
> levels.  If the log can't flush the data device, we should shut it down
> immediately because we've just violated an invariant.  Hence, check the
> return value of blkdev_issue_flush.
> 
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_file.c |   15 ++++++++++-----
>  fs/xfs/xfs_log.c  |    7 +++++--
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 5a171c0b244b..88450c33ab01 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -163,9 +163,11 @@ xfs_file_fsync(
>  	 * inode size in case of an extending write.
>  	 */
>  	if (XFS_IS_REALTIME_INODE(ip))
> -		blkdev_issue_flush(mp->m_rtdev_targp->bt_bdev);
> +		error = blkdev_issue_flush(mp->m_rtdev_targp->bt_bdev);
>  	else if (mp->m_logdev_targp != mp->m_ddev_targp)
> -		blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
> +		error = blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
> +	if (error)
> +		return error;
>  
>  	/*
>  	 * Any inode that has dirty modifications in the log is pinned.  The
> @@ -173,8 +175,11 @@ xfs_file_fsync(
>  	 * that happen concurrently to the fsync call, but fsync semantics
>  	 * only require to sync previously completed I/O.
>  	 */
> -	if (xfs_ipincount(ip))
> +	if (xfs_ipincount(ip)) {
>  		error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
> +		if (error)
> +			return error;
> +	}

Shouldn't we still try to flush the data device if necessary, even
if the log flush failed?
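
Something like the following, as a rough user-space sketch only. The
stub_* helpers, the fake_* knobs and fsync_tail() are hypothetical
stand-ins, not the real XFS functions. The idea it illustrates is to
remember the first error, still issue the data device flush, and
return the remembered error at the end:

```c
#include <stdio.h>

/* Hypothetical test knobs: what each stubbed operation returns. */
static int fake_log_flush_result;
static int fake_dev_flush_result;
static int dev_flush_calls;

/* Stand-in for the log force; reports a flush only on success. */
static int stub_flush_log(int *log_flushed)
{
	*log_flushed = (fake_log_flush_result == 0);
	return fake_log_flush_result;
}

/* Stand-in for the block device cache flush. */
static int stub_issue_flush(void)
{
	dev_flush_calls++;
	return fake_dev_flush_result;
}

/*
 * Model of the tail of an fsync path on a single-device filesystem:
 * a failed log flush does not skip the device flush, and the first
 * error seen is the one returned.
 */
static int fsync_tail(int pinned, int single_device)
{
	int error = 0;
	int err2;
	int log_flushed = 0;

	if (pinned)
		error = stub_flush_log(&log_flushed);

	if (!log_flushed && single_device) {
		err2 = stub_issue_flush();
		if (err2 && !error)
			error = err2;
	}
	return error;
}
```

With the log flush stubbed to fail, fsync_tail(1, 1) still calls the
device flush once and returns the log flush error.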

>  	/*
>  	 * If we only have a single device, and the log force about was
> @@ -185,9 +190,9 @@ xfs_file_fsync(
>  	 */
>  	if (!log_flushed && !XFS_IS_REALTIME_INODE(ip) &&
>  	    mp->m_logdev_targp == mp->m_ddev_targp)
> -		blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
> +		return blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
>  
> -	return error;
> +	return 0;
>  }
>  
>  static int
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 4b1c0a9c6368..8a767f4145f0 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1926,8 +1926,11 @@ xlog_write_iclog(
>  		 * by the LSN in this iclog is on stable storage. This is slow,
>  		 * but it *must* complete before we issue the external log IO.
>  		 */
> -		if (log->l_targ != log->l_mp->m_ddev_targp)
> -			blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev);
> +		if (log->l_targ != log->l_mp->m_ddev_targp &&
> +		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
> +			xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> +			return;
> +		}

That seems pretty drastic, though I'm not sure what else can be done
here apart from ignoring the data device flush error. Also, it's
not actually a log IO error - it's a data device IO error, so it's
really a metadata writeback problem. Hence the use of
SHUTDOWN_LOG_IO_ERROR probably needs a comment to explain why it
needs to be used here...
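
To make the shape of that concrete, here is a small user-space model.
Every name in it (model_log, model_write_iclog, the MODEL_* values) is
made up for illustration, and the rationale in the comment is my
assumption about what such a comment might say, not settled wording:

```c
#include <stdbool.h>

/* Hypothetical shutdown reasons; not the kernel's definitions. */
enum model_shutdown {
	MODEL_NO_SHUTDOWN = 0,
	MODEL_SHUTDOWN_LOG_IO_ERROR,
};

/* Hypothetical, heavily reduced model of the log state. */
struct model_log {
	bool external;            /* log is on its own device */
	int data_flush_result;    /* result of the data device flush */
	enum model_shutdown shutdown_reason;
	int iclog_writes;         /* iclog IOs actually submitted */
};

static void model_write_iclog(struct model_log *log)
{
	if (log->external && log->data_flush_result != 0) {
		/*
		 * The failed flush was on the data device, so strictly
		 * this is a metadata writeback failure, not a log IO
		 * failure.  Assumed rationale for reusing the log IO
		 * error reason: without the flush, the iclog cannot be
		 * written safely, so the log itself can go no further.
		 */
		log->shutdown_reason = MODEL_SHUTDOWN_LOG_IO_ERROR;
		return;
	}
	log->iclog_writes++;
}
```

In this model a failed data device flush on an external log records
the shutdown reason and submits no iclog IO; an internal log never
takes that branch.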

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx