Re: [PATCH 10/10] xfs: retry COW fork delalloc conversion when no

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Fri, 15 Feb 2019 15:32:25 -0800

On Fri, Feb 15, 2019 at 03:47:25PM +0100, Christoph Hellwig wrote:
> While we can only truncate a block under the page lock for the current
> page, there is no high-level synchronization for moving extents from the
> COW to the data fork.  This means that for example we can have another
> thread doing a direct I/O completion that moves extents from the COW to
> the data fork race with writeback.  While this race is very hard to hit
> the always_cow seems to reproduce it reasonably well, and it also exists
> without that.  Because of that there is a chance that a delalloc
> conversion for the COW fork might not find any extents to convert.  In
> that case we should retry the whole block lookup and now find the blocks
> in the data fork.

<thinking aloud mode>

I /think/ the way that this series (+ Brian's before that) solve the
truncate/writeback race is that we now only convert existing delalloc
reservations to real extents when we're preparing to do writeback;
_writepage_map only cares about the mapping of the offset_fsb that it
happens to be looping right now (because the page lock serializes with
page cache truncate/punch); and we use the new sequence counters for
both the data and cow forks to decide when our cached mapping might be
invalid and therefore we need to get a new mapping?

Therefore we don't need to keep calling _trim_extent_eof or checking
offset against i_size or any of those other games because writeback
won't go allocating blocks into holes that are being punched and now we
have an explicit mechanism to invalidate wpc->imap instead of the
scattered detection code we had before?

If that's true, then:

Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

--D
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  fs/xfs/xfs_aops.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index a6abb7125203..2ed8733eca49 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -334,7 +334,8 @@ xfs_imap_valid(
>   * extent that maps offset_fsb in wpc->imap.
>   *
>   * The current page is held locked so nothing could have removed the block
> - * backing offset_fsb.
> + * backing offset_fsb, although it could have moved from the COW to the data
> + * fork by another thread.
>   */
>  static int
>  xfs_convert_blocks(
> @@ -375,6 +376,7 @@ xfs_map_blocks(
>  	xfs_fileoff_t		cow_fsb = NULLFILEOFF;
>  	struct xfs_bmbt_irec	imap;
>  	struct xfs_iext_cursor	icur;
> +	int			retries = 0;
>  	int			error = 0;
>  
>  	if (XFS_FORCED_SHUTDOWN(mp))
> @@ -404,6 +406,7 @@ xfs_map_blocks(
>  	 * into real extents.  If we return without a valid map, it means we
>  	 * landed in a hole and we skip the block.
>  	 */
> +retry:
>  	xfs_ilock(ip, XFS_ILOCK_SHARED);
>  	ASSERT(ip->i_d.di_format != XFS_DINODE_FMT_BTREE ||
>  	       (ip->i_df.if_flags & XFS_IFEXTENTS));
> @@ -471,8 +474,19 @@ xfs_map_blocks(
>  	return 0;
>  allocate_blocks:
>  	error = xfs_convert_blocks(wpc, ip, offset_fsb);
> -	if (error)
> +	if (error) {
> +		/*
> +		 * If we failed to find the extent in the COW fork we might have
> +		 * raced with a COW to data fork conversion or truncate.
> +		 * Restart the lookup to catch the extent in the data fork for
> +		 * the former case, but prevent additional retries to avoid
> +		 * looping forever for the latter case.
> +		 */
> +		if (error == -EAGAIN && wpc->fork == XFS_COW_FORK && !retries++)
> +			goto retry;
> +		ASSERT(error != -EAGAIN);
>  		return error;
> +	}
>  
>  	/*
>  	 * Due to merging the return real extent might be larger than the
> -- 
> 2.20.1
>