Re: [PATCH 1/2] xfs: reserve quota for dir expansion when linking/unlinking files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 09, 2022 at 11:22:26AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> XFS does not reserve quota for directory expansion when linking or
> unlinking children from a directory.  This means that we don't reject
> the expansion with EDQUOT when we're at or near a hard limit, which
> means that unprivileged userspace can use link()/unlink() to exceed
> quota.
> 
> The fix for this is nuanced -- link operations don't always expand the
> directory, and we allow a link to proceed with no space reservation if
> we don't need to add a block to the directory to handle the addition.
> Unlink operations generally do not expand the directory (you'd have to
> free a block and then cause a btree split) and we can defer the
> directory block freeing if there is no space reservation.
> 
> Moreover, there is a further bug in that we do not trigger the blockgc
> workers to try to clear space when we're out of quota.
> 
> To fix both cases, create a new xfs_trans_alloc_dir function that
> allocates the transaction, locks and joins the inodes, and reserves
> quota for the directory.  If there isn't sufficient space or quota,
> we'll switch the caller to reservationless mode.  This should prevent
> quota usage overruns with the least restriction in functionality.
> 
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_inode.c |   30 +++++-------------
>  fs/xfs/xfs_trans.c |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_trans.h |    3 ++
>  3 files changed, 97 insertions(+), 22 deletions(-)

Overall looks good, minor nits below:

> 
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 04bf467b1090..a131bbfe74e4 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1217,7 +1217,7 @@ xfs_link(
>  {
>  	xfs_mount_t		*mp = tdp->i_mount;
>  	xfs_trans_t		*tp;
> -	int			error;
> +	int			error, space_error;
>  	int			resblks;
>  
>  	trace_xfs_link(tdp, target_name);
> @@ -1236,19 +1236,11 @@ xfs_link(
>  		goto std_return;
>  
>  	resblks = XFS_LINK_SPACE_RES(mp, target_name->len);
> -	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, resblks, 0, 0, &tp);
> -	if (error == -ENOSPC) {
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, 0, 0, 0, &tp);
> -	}
> +	error = xfs_trans_alloc_dir(tdp, &M_RES(mp)->tr_link, sip, &resblks,
> +			&tp, &space_error);

It's the nospace_error, isn't it? The code reads a lot better when
it's called that, too.


>  	if (error)
>  		goto std_return;
>  
> -	xfs_lock_two_inodes(sip, XFS_ILOCK_EXCL, tdp, XFS_ILOCK_EXCL);
> -
> -	xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL);
> -	xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL);
> -
>  	error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK,
>  			XFS_IEXT_DIR_MANIP_CNT(mp));
>  	if (error)
> @@ -1267,6 +1259,8 @@ xfs_link(
>  
>  	if (!resblks) {
>  		error = xfs_dir_canenter(tp, tdp, target_name);
> +		if (error == -ENOSPC && space_error)
> +			error = space_error;

This  would be better in the error_return stack, I think. That way
the transformation only has to be done once, and it will be done for
all functions that can potentially return ENOSPC.

>  		if (error)
>  			goto error_return;
>  	}
> @@ -2755,6 +2749,7 @@ xfs_remove(
>  	xfs_mount_t		*mp = dp->i_mount;
>  	xfs_trans_t             *tp = NULL;
>  	int			is_dir = S_ISDIR(VFS_I(ip)->i_mode);
> +	int			dontcare;
>  	int                     error = 0;
>  	uint			resblks;
>  
> @@ -2781,22 +2776,13 @@ xfs_remove(
>  	 * block from the directory.
>  	 */
>  	resblks = XFS_REMOVE_SPACE_RES(mp);
> -	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_remove, resblks, 0, 0, &tp);
> -	if (error == -ENOSPC) {
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_remove, 0, 0, 0,
> -				&tp);
> -	}
> +	error = xfs_trans_alloc_dir(dp, &M_RES(mp)->tr_remove, ip, &resblks,
> +			&tp, &dontcare);
>  	if (error) {
>  		ASSERT(error != -ENOSPC);
>  		goto std_return;
>  	}

So we just ignore -EDQUOT when it is returned in @dontcare? I'd like
a comment to explain why we don't care about EDQUOT here, because
the next time I look at this I will have forgotten all about this...

Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux