Re: xfs: pass total block res. as total xfs_bmapi_write() parameter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 15, 2017 at 02:24:14PM +0200, Nikolay Borisov wrote:
> From: Brian Foster <bfoster@xxxxxxxxxx>
> 
> The total field from struct xfs_alloc_arg is a bit of an unknown
> commodity. It is documented as the total block requirement for the
> transaction and is used in this manner from most call sites by virtue of
> passing the total block reservation of the transaction associated with
> an allocation. Several xfs_bmapi_write() callers pass hardcoded values
> of 0 or 1 for the total block requirement, which is a historical oddity
> without any clear reasoning.
> 
> The xfs_iomap_write_direct() caller, for example, passes 0 for the total
> block requirement. This has been determined to cause problems in the
> form of ABBA deadlocks of AGF buffers due to incorrect AG selection in
> the block allocator. Specifically, the xfs_alloc_space_available()
> function incorrectly selects an AG that doesn't actually have sufficient
> space for the allocation. This occurs because the args.total field is 0
> and thus the remaining free space check on the AG doesn't actually
> consider the size of the allocation request. This locks the AGF buffer,
> the allocation attempt proceeds and ultimately fails (in
> xfs_alloc_fix_minleft()), and xfs_alloc_vexent() moves on to the next
> AG. In turn, this can lead to incorrect AG locking order (if the
> allocator wraps around, attempting to lock AG 0 after acquiring AG N)
> and thus deadlock if racing with another operation. This problem has
> been reproduced via generic/299 on smallish (1GB) ramdisk test devices.
> 
> To avoid this problem, replace the undocumented hardcoded total
> parameters from the iomap and utility callers to pass the block
> reservation used for the associated transaction. This is consistent with
> other xfs_bmapi_write() callers throughout XFS. The assumption is that
> the total field allows the selection of an AG that can handle the entire
> operation rather than simply the allocation/range being requested (e.g.,
> resulting btree splits, etc.). This addresses the aforementioned
> generic/299 hang by ensuring AG selection only occurs when the
> allocation can be satisfied by the AG.
> 
> Reported-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
> Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>
> Signed-off-by: Nikolay Borisov <nborisov@xxxxxxxx>
> ---
> 
> Here is the backport for 3.12 
> 

Seems fine to me, thanks. FWIW:

Acked-by: Brian Foster <bfoster@xxxxxxxxxx>

>  fs/xfs/xfs_bmap_util.c | 2 +-
>  fs/xfs/xfs_iomap.c     | 8 ++++----
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 2843a0426bca..66172581dacb 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -1146,7 +1146,7 @@ retry:
>  		xfs_bmap_init(&free_list, &firstfsb);
>  		error = xfs_bmapi_write(tp, ip, startoffset_fsb,
>  					allocatesize_fsb, alloc_type, &firstfsb,
> -					0, imapp, &nimaps, &free_list);
> +					resblks, imapp, &nimaps, &free_list);
>  		if (error) {
>  			goto error0;
>  		}
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 43e31b0b288b..da87415ac561 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -217,7 +217,7 @@ xfs_iomap_write_direct(
>  	xfs_bmap_init(&free_list, &firstfsb);
>  	nimaps = 1;
>  	error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb, bmapi_flag,
> -				&firstfsb, 0, imap, &nimaps, &free_list);
> +				&firstfsb, resblks, imap, &nimaps, &free_list);
>  	if (error)
>  		goto out_bmap_cancel;
>  
> @@ -762,7 +762,7 @@ xfs_iomap_write_allocate(
>  			error = xfs_bmapi_write(tp, ip, map_start_fsb,
>  						count_fsb,
>  						XFS_BMAPI_STACK_SWITCH,
> -						&first_block, 1,
> +						&first_block, nres,
>  						imap, &nimaps, &free_list);
>  			if (error)
>  				goto trans_cancel;
> @@ -877,8 +877,8 @@ xfs_iomap_write_unwritten(
>  		xfs_bmap_init(&free_list, &firstfsb);
>  		nimaps = 1;
>  		error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
> -				  XFS_BMAPI_CONVERT, &firstfsb,
> -				  1, &imap, &nimaps, &free_list);
> +					XFS_BMAPI_CONVERT, &firstfsb, resblks,
> +					&imap, &nimaps, &free_list);
>  		if (error)
>  			goto error_on_bmapi_transaction;
>  
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux