Re: [PATCH v2] xfs: up(ic_sema) if flushing data device fails

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 26 Oct 2023 11:41:03 +1100

On Wed, Oct 25, 2023 at 03:58:48PM -0700, Leah Rumancik wrote:
> We flush the data device cache before we issue external log IO. If
> the flush fails, we shut down the log immediately and return. However,
> the iclog->ic_sema is left in a decremented state so let's add an up().
> Prior to this patch, xfs/438 would fail consistently when running with
> an external log device:
> 
> sync
>   -> xfs_log_force
>   -> xlog_write_iclog
>       -> down(&iclog->ic_sema)
>       -> blkdev_issue_flush (fail causes us to intiate shutdown)
>           -> xlog_force_shutdown
>           -> return
> 
> unmount
>   -> xfs_log_umount
>       -> xlog_wait_iclog_completion
>           -> down(&iclog->ic_sema) --------> HANG
> 
> There is a second early return / shutdown. Make sure the up() happens
> for it as well.
> 
> Fixes: b5d721eaae47 ("xfs: external logs need to flush data device")
> Fixes: 842a42d126b4 ("xfs: shutdown on failure to add page to log bio")
> Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
> Signed-off-by: Leah Rumancik <leah.rumancik@xxxxxxxxx>
> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
> 
> Notes:
>     v1->v2:
>      - use goto instead of multiple returns
>      - add Fixes tags
> 
>  fs/xfs/xfs_log.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 51c100c86177..8ae923e00eb6 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1925,20 +1925,17 @@ xlog_write_iclog(
>  		 * avoid shutdown re-entering this path and erroring out again.
>  		 */
>  		if (log->l_targ != log->l_mp->m_ddev_targp &&
> -		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
> -			xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> -			return;
> -		}
> +		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev))
> +			goto shutdown;
>  	}
>  	if (iclog->ic_flags & XLOG_ICL_NEED_FUA)
>  		iclog->ic_bio.bi_opf |= REQ_FUA;
>  
>  	iclog->ic_flags &= ~(XLOG_ICL_NEED_FLUSH | XLOG_ICL_NEED_FUA);
>  
> -	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) {
> -		xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> -		return;
> -	}
> +	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count))
> +		goto shutdown;
> +
>  	if (is_vmalloc_addr(iclog->ic_data))
>  		flush_kernel_vmap_range(iclog->ic_data, count);
>  
> @@ -1959,6 +1956,10 @@ xlog_write_iclog(
>  	}
>  
>  	submit_bio(&iclog->ic_bio);
> +	return;
> +shutdown:
> +	up(&iclog->ic_sema);
> +	xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);

Hmmmm. So we'll leave an unreferenced, unlocked iclog in a SYNCING
state where something else can access it, then hope that the
shutdown gets to it first and that it cleans it all up? That doesn't
seem completely safe to me.

At entry to this function, if the log is already shut down, it runs
xlog_state_done_syncing() to force the iclog and it's attached state
to be cleaned up before dropping the lock and returning.

If I look at xlog_ioend_work(), if it gets an error it does the
shutdown, then calls xlog_state_done_syncing() to clean up the iclog
and attached state, then drops the lock.

Hence it appears to me that the error handling for a fatal errors in
IO submission should match the io completion error handling. i.e the
error stack for this function should look like this:

....
	if (xlog_is_shutdown(log))
		goto out_done_sync;
....
	if (error)
		goto out_do_shutdown;
....
out_do_shutdown:
	xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
out_done_sync:
	xlog_state_done_syncing(iclog);
	up(&iclog->ic_sema);
}

Thoughts?

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx