Re: [RFC PATCH] xfs: borrow indirect blocks from freed extent when available

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 23, 2014 at 03:28:58PM -0400, Brian Foster wrote:
> xfs_bmap_del_extent() handles extent removal from the in-core and
> on-disk extent lists. When removing a delalloc range, it updates the
> indirect block reservation appropriately based on the removal. It
> currently enforces that the new indirect block reservation is less than
> or equal to the original. This is normally the case in all situations
> except for when the removed range creates a hole in a single delalloc
> extent, thus splitting a single delalloc extent in two.
> 
> The indirect block reservation is divided evenly between the two new
> extents in this scenario. However, it is possible with small enough
> extents to split an indlen==1 extent into two such slightly smaller
> extents. This leaves one extent with 0 indirect blocks and leads to
> assert failures in other areas (e.g., xfs_bunmapi() if the extent
> happens to be removed).

I had long suspected we had an issue in this code, but was never
able to nail down a reproducer that triggered it. Do you have a
reproducer, or did you find this by reading/tracing the code?

> Refactor xfs_bunmapi() to make the updates that must be consistent
> against the inode (e.g., delalloc block counter, quota reservation)
> right before the extent is deleted. Move the sb block counter update
> after the extent is deleted and update xfs_bmap_del_extent() to steal
> some blocks from the freed extent if a larger overall indirect
> reservation is required by the extent removal.
> 
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> ---
> 
> Hi all,
> 
> I'm seeing the following assert more frequently with fsx and the recent
> xfs_free_file_space() changes (at least on 1k fsb fs'):
> 
> XFS: Assertion failed: startblockval(del.br_startblock) > 0, file: fs/xfs/libxfs/xfs_bmap.c, line: 5281
> 
> This occurs for the reason described in the commit log description. This
> is a quick experiment I wanted to test to verify the problem goes away
> (so far, so good). Very lightly tested so far.

I suspect it's also the cause of these occasional assert failures
that I see:

XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: fs/xfs/xfs_trans.c, line: 327

during delalloc conversion because there wasn't a space reservation
for the blocks allocated (i.e. indlen was zero) and so we overrun
the transaction block reservation.

> I'm not too fond of changing br_blockcount like this. It seems like a
> potential landmine. Alternative approaches could be to kill the assert
> if we think indlen==0 extents is not a huge problem in this scenario,

A delalloc extent always needs a reservation....

> update the counters independently in xfs_bmap_del_extent() to get the
> needed blocks or pass a separate output parameter rather than messing
> with br_blockcount (e.g., '*ofreedblks'). The latter might mean we could
> just move the entire hunk that updates the inode/quota and whatnot
> rather than splitting it up.
> 
> I wanted to put this on the list for comments. Thoughts? Other ideas?
> Thanks.
> 
> Brian
> 
>  fs/xfs/libxfs/xfs_bmap.c | 64 ++++++++++++++++++++++++++++++------------------
>  1 file changed, 40 insertions(+), 24 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 86df952..1858e6b 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -4969,6 +4969,11 @@ xfs_bmap_del_extent(
>  			temp2 = xfs_bmap_worst_indlen(ip, temp2);
>  			new.br_startblock = nullstartblock((int)temp2);
>  			da_new = temp + temp2;
> +			/* pull blocks from extent to make up the difference */
> +			while (da_new > da_old && del->br_blockcount) {
> +				del->br_blockcount--;
> +				da_new--;
> +			}
>  			while (da_new > da_old) {
>  				if (temp) {
>  					temp--;

So this is the crux of the fix, right? The block stealing from the
deleted extent? I have no objections to doing this given that we
already munge the indirect reservations inteh loop below, but
I think we should make this cleaner and more understandable. i.e.
refactor the indlen adjustment code into a helper, and integrate the
block stealing into the existing adjustment loop?

> @@ -5277,9 +5282,37 @@ xfs_bunmapi(
>  				goto nodelete;
>  			}
>  		}
> +
> +		/*
> +		 * If it's the case where the directory code is running
> +		 * with no block reservation, and the deleted block is in
> +		 * the middle of its extent, and the resulting insert
> +		 * of an extent would cause transformation to btree format,
> +		 * then reject it.  The calling code will then swap
> +		 * blocks around instead.
> +		 * We have to do this now, rather than waiting for the
> +		 * conversion to btree format, since the transaction
> +		 * will be dirty.
> +		 */
> +		if (!wasdel && xfs_trans_get_block_res(tp) == 0 &&
> +		    XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS &&
> +		    XFS_IFORK_NEXTENTS(ip, whichfork) >= /* Note the >= */
> +			XFS_IFORK_MAXEXT(ip, whichfork) &&
> +		    del.br_startoff > got.br_startoff &&
> +		    del.br_startoff + del.br_blockcount <
> +		    got.br_startoff + got.br_blockcount) {
> +			error = -ENOSPC;
> +			goto error0;
> +		}
> +
> +		/*
> +		 * Unreserve quota and update realtime free space, if
> +		 * appropriate. If delayed allocation, update the inode delalloc
> +		 * counter now and wait to update the sb counters as
> +		 * xfs_bmap_del_extent() might need to borrow some blocks.
> +		 */
>  		if (wasdel) {
>  			ASSERT(startblockval(del.br_startblock) > 0);
> -			/* Update realtime/data freespace, unreserve quota */
>  			if (isrt) {
>  				xfs_filblks_t rtexts;
>  

Perhaps this should be separated into it's own patch that is applied
first? It doesn't seem to be directly related to the accounting fix...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs




[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux