Re: [PATCH v2] xfs: fix iclog release error check race with shutdown

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote:
> Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with
> l_icloglock held"), xlog_state_release_iclog() always performed a
> locked check of the iclog error state before proceeding into the
> sync state processing code. As of this commit, part of
> xlog_state_release_iclog() was open-coded into
> xfs_log_release_iclog() and as a result the locked error state check
> was lost.
> 
> The lockless check still exists, but this doesn't account for the
> possibility of a race with a shutdown being performed by another
> task causing the iclog state to change while the original task waits
> on ->l_icloglock. This has reproduced very rarely via generic/475
> and manifests as an assert failure in __xlog_state_release_iclog()
> due to an unexpected iclog state.
> 
> Restore the locked error state check in xlog_state_release_iclog()
> to ensure that an iclog state update via shutdown doesn't race with
> the iclog release state processing code.
> 
> Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held")
> Reported-by: Zorro Lang <zlang@xxxxxxxxxx>
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> ---
> 
> v2:
> - Include Fixes tag.
> - Use common error path to include shutdown call.
> v1: https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@xxxxxxxxxx/
> 
>  fs/xfs/xfs_log.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index f6006d94a581..796ff37d5bb5 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -605,18 +605,23 @@ xfs_log_release_iclog(
>  	struct xlog		*log = mp->m_log;
>  	bool			sync;
>  
> -	if (iclog->ic_state == XLOG_STATE_IOERROR) {
> -		xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR);
> -		return -EIO;
> -	}

hmmmm.

xfs_force_shutdown() will does nothing if the iclog at the head of
the log->iclog list is marked with XLOG_STATE_IOERROR before IO is
submitted. In general, that is the case here as the head iclog is
what xlog_state_get_iclog_space() returns.

i.e. XLOG_STATE_IOERROR here implies the log has already been shut
down because the state is IOERROR rather than ACTIVE or WANT_SYNC as
was set when the iclog was obtained from
xlog_state_get_iclog_space().

> +	if (iclog->ic_state == XLOG_STATE_IOERROR)
> +		goto error;
>  
>  	if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) {
> +		if (iclog->ic_state == XLOG_STATE_IOERROR) {
> +			spin_unlock(&log->l_icloglock);
> +			goto error;
> +		}
>  		sync = __xlog_state_release_iclog(log, iclog);
>  		spin_unlock(&log->l_icloglock);
>  		if (sync)
>  			xlog_sync(log, iclog);
>  	}
>  	return 0;
> +error:
> +	xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR);
> +	return -EIO;

Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log))
just like is in xfs_log_force_umount() when this pre-existing log
IO error condition is tripped over...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux