Re: [PATCH v9 22/24] xfs: Add delay ready attr remove routines

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 30, 2020 at 03:50:14PM -0700, Allison Collins wrote:
> This patch modifies the attr remove routines to be delay ready. This
> means they no longer roll or commit transactions, but instead return
> -EAGAIN to have the calling routine roll and refresh the transaction. In
> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> uses a sort of state machine like switch to keep track of where it was
> when EAGAIN was returned. xfs_attr_node_removename has also been
> modified to use the switch, and a new version of xfs_attr_remove_args
> consists of a simple loop to refresh the transaction until the operation
> is completed.  A new XFS_DAC_DEFER_FINISH flag is used to finish the
> transaction where ever the existing code used to.
> 
> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> version __xfs_attr_rmtval_remove. We will rename
> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> done.
> 
> xfs_attr_rmtval_remove itself is still in use by the set routines (used
> during a rename).  For reasons of perserving existing function, we
> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> set.  Similar to how xfs_attr_remove_args does here.  Once we transition
> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> used and will be removed.
> 
> This patch also adds a new struct xfs_delattr_context, which we will use
> to keep track of the current state of an attribute operation. The new
> xfs_delattr_state enum is used to track various operations that are in
> progress so that we know not to repeat them, and resume where we left
> off before EAGAIN was returned to cycle out the transaction. Other
> members take the place of local variables that need to retain their
> values across multiple function recalls.
> 
> Below is a state machine diagram for attr remove operations. The
> XFS_DAS_* states indicate places where the function would return
> -EAGAIN, and then immediately resume from after being recalled by the
> calling function.  States marked as a "subroutine state" indicate that
> they belong to a subroutine, and so the calling function needs to pass
> them back to that subroutine to allow it to finish where it left off.
> But they otherwise do not have a role in the calling function other than
> just passing through.
> 
>  xfs_attr_remove_iter()
>          XFS_DAS_RM_SHRINK     ─┐
>          (subroutine state)     │
>                                 └─>xfs_attr_node_removename()
>
>                                                  v
>                                               need to
>                                            shrink tree? ─n─┐
>                                                  │         │
>                                                  y         │
>                                                  │         │
>                                                  v         │
>                                          XFS_DAS_RM_SHRINK │
>                                                  │         │
>                                                  v         │
>                                                 done <─────┘
> 
> Signed-off-by: Allison Collins <allison.henderson@xxxxxxxxxx>
> ---
>  fs/xfs/libxfs/xfs_attr.c        | 159 ++++++++++++++++++++++++++++++----------
>  fs/xfs/libxfs/xfs_attr.h        |  73 ++++++++++++++++++
>  fs/xfs/libxfs/xfs_attr_leaf.c   |   2 +-
>  fs/xfs/libxfs/xfs_attr_remote.c |  31 ++++----
>  fs/xfs/libxfs/xfs_attr_remote.h |   2 +-
>  fs/xfs/xfs_attr_inactive.c      |   2 +-
>  6 files changed, 215 insertions(+), 54 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index c8cae68..7213589 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -53,12 +53,21 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>   */
>  STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>  STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
> -STATIC int xfs_attr_node_removename(xfs_da_args_t *args);
> +STATIC int xfs_attr_node_removename(struct xfs_delattr_context *dac);
>  STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>  				 struct xfs_da_state **state);
>  STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>  STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>  
> +void
> +xfs_delattr_context_init(
> +	struct xfs_delattr_context	*dac,
> +	struct xfs_da_args		*args)
> +{
> +	memset(dac, 0, sizeof(struct xfs_delattr_context));
> +	dac->da_args = args;

Couldn't this could be open coded as:

struct xfs_delattr_context	dac = {
	.da_args		= args,
};

in the callers?  One less helper function, and it means that the stack
contents will always be fully initialized.

> +}
> +
>  int
>  xfs_inode_hasattr(
>  	struct xfs_inode	*ip)
> @@ -263,6 +272,18 @@ xfs_attr_set_shortform(
>  	return 0;
>  }
>  
> +int xfs_attr_defer_finish(

static inline int
xfs_attr_defer_finish(

> +	struct xfs_delattr_context      *dac)
> +{
> +	struct xfs_da_args              *args = dac->da_args;
> +
> +	if (!(dac->flags & XFS_DAC_DEFER_FINISH))
> +		return 0;
> +
> +	dac->flags &= ~XFS_DAC_DEFER_FINISH;
> +	return xfs_defer_finish(&args->trans);

I also wonder if you want to hoist the trans_roll_inode to here too?

Actually, what if this instead became:

static inline bool
xfs_attr_roll_again(
	struct xfs_delattr_context	*dac,
	int				*error)
{
	if (*error != -EAGAIN)
		return false;

	if (dac->flags & XFS_DAC_DEFER_FINISH)) {
		dac->flags &= ~XFS_DAC_DEFER_FINISH;
		*error = xfs_defer_finish(&args->trans);
		if (*error)
			return false;
	}

	*error = xfs_trans_roll_inode(&dac->da_args->trans, dac->da_args->dp);
	return *error == 0;
}

> +}
> +
>  /*
>   * Set the attribute specified in @args.
>   */
> @@ -363,23 +384,57 @@ xfs_has_attr(
>   */
>  int
>  xfs_attr_remove_args(
> -	struct xfs_da_args      *args)
> +	struct xfs_da_args	*args)
> +{
> +	int			error = 0;
> +	struct			xfs_delattr_context dac;
> +
> +	xfs_delattr_context_init(&dac, args);
> +
> +	do {
> +		error = xfs_attr_remove_iter(&dac);
> +		if (error != -EAGAIN)
> +			break;
> +
> +		error = xfs_attr_defer_finish(&dac);
> +		if (error)
> +			break;
> +
> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> +		if (error)
> +			break;
> +	} while (true);

and then this whole function becomes:

int
xfs_attr_remove_args(
	struct xfs_da_args		*args)
{
	struct xfs_delattr_context	dac = {
		.da_args		= args,
	};

	do {
		error = xfs_attr_remove_iter(&dac);
	} while (xfs_attr_roll_again(&dac, &error));

	return error;
}

> +
> +	return error;
> +}
> +
> +/*
> + * Remove the attribute specified in @args.
> + *
> + * This function may return -EAGAIN to signal that the transaction needs to be
> + * rolled.  Callers should continue calling this function until they receive a
> + * return value other than -EAGAIN.
> + */
> +int
> +xfs_attr_remove_iter(
> +	struct xfs_delattr_context *dac)
>  {
> +	struct xfs_da_args	*args = dac->da_args;
>  	struct xfs_inode	*dp = args->dp;
> -	int			error;
> +
> +	if (dac->dela_state == XFS_DAS_RM_SHRINK)
> +		goto node;
>  
>  	if (!xfs_inode_hasattr(dp)) {
> -		error = -ENOATTR;
> +		return -ENOATTR;
>  	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
>  		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> -		error = xfs_attr_shortform_remove(args);
> +		return xfs_attr_shortform_remove(args);
>  	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> -		error = xfs_attr_leaf_removename(args);
> -	} else {
> -		error = xfs_attr_node_removename(args);
> +		return xfs_attr_leaf_removename(args);
>  	}
> -
> -	return error;
> +node:
> +	return  xfs_attr_node_removename(dac);
>  }
>  
>  /*
> @@ -1177,15 +1232,17 @@ xfs_attr_leaf_mark_incomplete(
>  
>  /*
>   * Initial setup for xfs_attr_node_removename.  Make sure the attr is there and
> - * the blocks are valid.  Any remote blocks will be marked incomplete.
> + * the blocks are valid.  Any remote blocks will be marked incomplete and
> + * invalidated.
>   */
>  STATIC
>  int xfs_attr_node_removename_setup(
> -	struct xfs_da_args	*args,
> -	struct xfs_da_state	**state)
> +	struct xfs_delattr_context	*dac,
> +	struct xfs_da_state		**state)
>  {
> -	int			error;
> -	struct xfs_da_state_blk	*blk;
> +	struct xfs_da_args		*args = dac->da_args;
> +	int				error;
> +	struct xfs_da_state_blk		*blk;
>  
>  	error = xfs_attr_node_hasname(args, state);
>  	if (error != -EEXIST)
> @@ -1195,6 +1252,13 @@ int xfs_attr_node_removename_setup(
>  	ASSERT(blk->bp != NULL);
>  	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
>  
> +	/*
> +	 * Store blk and state in the context incase we need to cycle out the
> +	 * transaction
> +	 */
> +	dac->blk = blk;
> +	dac->da_state = *state;
> +
>  	if (args->rmtblkno > 0) {
>  		error = xfs_attr_leaf_mark_incomplete(args, *state);
>  		if (error)
> @@ -1210,12 +1274,15 @@ int xfs_attr_node_removename_setup(
>  
>  STATIC int
>  xfs_attr_node_removename_rmt (
> -	struct xfs_da_args	*args,
> -	struct xfs_da_state	*state)
> +	struct xfs_delattr_context	*dac,
> +	struct xfs_da_state		*state)
>  {
> -	int			error = 0;
> +	int				error = 0;
>  
> -	error = xfs_attr_rmtval_remove(args);
> +	/*
> +	 * May return -EAGAIN to request that the caller recall this function
> +	 */
> +	error = __xfs_attr_rmtval_remove(dac);
>  	if (error)
>  		return error;
>  
> @@ -1232,21 +1299,35 @@ xfs_attr_node_removename_rmt (
>   * This will involve walking down the Btree, and may involve joining
>   * leaf nodes and even joining intermediate nodes up to and including
>   * the root node (a special case of an intermediate node).
> + *
> + * This routine is meant to function as either an inline or delayed operation,
> + * and may return -EAGAIN when the transaction needs to be rolled.  Calling
> + * functions will need to handle this, and recall the function until a
> + * successful error code is returned.
>   */
>  STATIC int
>  xfs_attr_node_removename(
> -	struct xfs_da_args	*args)
> +	struct xfs_delattr_context	*dac)
>  {
> -	struct xfs_da_state	*state;
> -	struct xfs_da_state_blk	*blk;
> -	int			retval, error;
> -	struct xfs_inode	*dp = args->dp;
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_state		*state;
> +	struct xfs_da_state_blk		*blk;
> +	int				retval, error;
> +	struct xfs_inode		*dp = args->dp;
>  
>  	trace_xfs_attr_node_removename(args);
> +	state = dac->da_state;
> +	blk = dac->blk;
>  
> -	error = xfs_attr_node_removename_setup(args, &state);
> -	if (error)
> -		goto out;
> +	if (dac->dela_state == XFS_DAS_RM_SHRINK)
> +		goto das_rm_shrink;
> +
> +	if ((dac->flags & XFS_DAC_NODE_RMVNAME_INIT) == 0) {
> +		dac->flags |= XFS_DAC_NODE_RMVNAME_INIT;
> +		error = xfs_attr_node_removename_setup(dac, &state);

Shouldn't XFS_DAC_NODE_RMVNAME_INIT be set by
xfs_attr_node_removename_setup?

> +		if (error)
> +			goto out;
> +	}
>  
>  	/*
>  	 * If there is an out-of-line value, de-allocate the blocks.
> @@ -1254,8 +1335,13 @@ xfs_attr_node_removename(
>  	 * overflow the maximum size of a transaction and/or hit a deadlock.
>  	 */
>  	if (args->rmtblkno > 0) {
> -		error = xfs_attr_node_removename_rmt(args, state);
> -		if (error)
> +		/*
> +		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
> +		 */
> +		error = xfs_attr_node_removename_rmt(dac, state);
> +		if (error == -EAGAIN)
> +			return error;
> +		else if (error)
>  			goto out;
>  	}
>  
> @@ -1274,17 +1360,14 @@ xfs_attr_node_removename(
>  		error = xfs_da3_join(state);
>  		if (error)
>  			goto out;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			goto out;
> -		/*
> -		 * Commit the Btree join operation and start a new trans.
> -		 */
> -		error = xfs_trans_roll_inode(&args->trans, dp);
> -		if (error)
> -			goto out;
> +
> +		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		dac->dela_state = XFS_DAS_RM_SHRINK;
> +		return -EAGAIN;
>  	}
>  
> +das_rm_shrink:
> +
>  	/*
>  	 * If the result is small enough, push it all into the inode.
>  	 */
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 66575b8..0430c79 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -74,6 +74,75 @@ struct xfs_attr_list_context {
>  };
>  
>  
> +/*
> + * ========================================================================
> + * Structure used to pass context around among the delayed routines.
> + * ========================================================================
> + */
> +
> +/*
> + * Below is a state machine diagram for attr remove operations. The  XFS_DAS_*
> + * states indicate places where the function would return -EAGAIN, and then
> + * immediately resume from after being recalled by the calling function. States
> + * marked as a "subroutine state" indicate that they belong to a subroutine, and
> + * so the calling function needs to pass them back to that subroutine to allow
> + * it to finish where it left off. But they otherwise do not have a role in the
> + * calling function other than just passing through.
> + *
> + * xfs_attr_remove_iter()
> + *	  XFS_DAS_RM_SHRINK ─┐
> + *	  (subroutine state) │
> + *	                     └─>xfs_attr_node_removename()
> + *	                                      │
> + *	                                      v
> + *	                                   need to
> + *	                                shrink tree? ─n─┐
> + *	                                      │         │
> + *	                                      y         │
> + *	                                      │         │
> + *	                                      v         │
> + *	                              XFS_DAS_RM_SHRINK │
> + *	                                      │         │
> + *	                                      v         │
> + *	                                     done <─────┘
> + *
> + */
> +
> +/*
> + * Enum values for xfs_delattr_context.da_state
> + *
> + * These values are used by delayed attribute operations to keep track  of where
> + * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
> + * calling function to roll the transaction, and then recall the subroutine to
> + * finish the operation.  The enum is then used by the subroutine to jump back
> + * to where it was and resume executing where it left off.
> + */
> +enum xfs_delattr_state {
> +				      /* Zero is uninitalized */
> +	XFS_DAS_RM_SHRINK	= 1,  /* We are shrinking the tree */
> +};
> +
> +/*
> + * Defines for xfs_delattr_context.flags
> + */
> +#define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
> +#define XFS_DAC_NODE_RMVNAME_INIT	0x02 /* xfs_attr_node_removename init */
> +
> +/*
> + * Context used for keeping track of delayed attribute operations
> + */
> +struct xfs_delattr_context {
> +	struct xfs_da_args      *da_args;
> +
> +	/* Used in xfs_attr_node_removename to roll through removing blocks */
> +	struct xfs_da_state     *da_state;
> +	struct xfs_da_state_blk *blk;
> +
> +	/* Used to keep track of current state of delayed operation */
> +	unsigned int            flags;
> +	enum xfs_delattr_state  dela_state;
> +};
> +
>  /*========================================================================
>   * Function prototypes for the kernel.
>   *========================================================================*/
> @@ -91,6 +160,10 @@ int xfs_attr_set(struct xfs_da_args *args);
>  int xfs_attr_set_args(struct xfs_da_args *args);
>  int xfs_has_attr(struct xfs_da_args *args);
>  int xfs_attr_remove_args(struct xfs_da_args *args);
> +int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
> +int xfs_attr_defer_finish(struct xfs_delattr_context *dac);
>  bool xfs_attr_namecheck(const void *name, size_t length);
> +void xfs_delattr_context_init(struct xfs_delattr_context *dac,
> +			      struct xfs_da_args *args);
>  
>  #endif	/* __XFS_ATTR_H__ */
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index f55402b..2e36c8b 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -19,8 +19,8 @@
>  #include "xfs_bmap_btree.h"
>  #include "xfs_bmap.h"
>  #include "xfs_attr_sf.h"
> -#include "xfs_attr_remote.h"
>  #include "xfs_attr.h"
> +#include "xfs_attr_remote.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_error.h"
>  #include "xfs_trace.h"
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index f770159..f2d46c7 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -676,14 +676,16 @@ xfs_attr_rmtval_invalidate(
>   */
>  int
>  xfs_attr_rmtval_remove(
> -	struct xfs_da_args      *args)
> +	struct xfs_da_args		*args)
>  {
> -	xfs_dablk_t		lblkno;
> -	int			blkcnt;
> -	int			error = 0;
> -	int			retval = 0;
> +	struct xfs_delattr_context	dac;
> +	xfs_dablk_t			lblkno;
> +	int				blkcnt;
> +	int				error = 0;
> +	int				retval = 0;
>  
>  	trace_xfs_attr_rmtval_remove(args);
> +	xfs_delattr_context_init(&dac, args);
>  
>  	/*
>  	 * Keep de-allocating extents until the remote-value region is gone.
> @@ -691,10 +693,14 @@ xfs_attr_rmtval_remove(
>  	lblkno = args->rmtblkno;
>  	blkcnt = args->rmtblkcnt;
>  	do {
> -		retval = __xfs_attr_rmtval_remove(args);
> +		retval = __xfs_attr_rmtval_remove(&dac);
>  		if (retval && retval != EAGAIN)
>  			return retval;
>  
> +		error = xfs_attr_defer_finish(&dac);
> +		if (error)
> +			break;
> +
>  		/*
>  		 * Close out trans and start the next one in the chain.
>  		 */
> @@ -713,9 +719,10 @@ xfs_attr_rmtval_remove(
>   */
>  int
>  __xfs_attr_rmtval_remove(
> -	struct xfs_da_args	*args)
> +	struct xfs_delattr_context	*dac)
>  {
> -	int			error, done;
> +	struct xfs_da_args		*args = dac->da_args;
> +	int				error, done;
>  
>  	/*
>  	 * Unmap value blocks for this attr.
> @@ -725,12 +732,10 @@ __xfs_attr_rmtval_remove(
>  	if (error)
>  		return error;
>  
> -	error = xfs_defer_finish(&args->trans);
> -	if (error)
> -		return error;
> -
> -	if (!done)
> +	if (!done) {
> +		dac->flags &= ~XFS_DAC_DEFER_FINISH;

Why do we strip off DEFER_FINISH here?  Is it because we're making
xfs_attr_rmtval_remove invoke xfs_defer_finish unconditionally to clear
out any pending rmap items?

If you transformed the do-while loop in xfs_attr_rmtval_remove to use
xfs_attr_roll_again, then you could make this set DEFER_FINISH, which
would make me a lot less suspicious that we're dropping state here.

(It would also clean up a lot of the defer_finish -> trans_roll_inode
code that's scattering everywhere...)

--D

>  		return -EAGAIN;
> +	}
>  
>  	return error;
>  }
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
> index ee3337b..351da00 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.h
> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
> @@ -14,5 +14,5 @@ int xfs_attr_rmtval_remove(struct xfs_da_args *args);
>  int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
>  		xfs_buf_flags_t incore_flags);
>  int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
> -int __xfs_attr_rmtval_remove(struct xfs_da_args *args);
> +int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
>  #endif /* __XFS_ATTR_REMOTE_H__ */
> diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> index c42f90e..b2150fa 100644
> --- a/fs/xfs/xfs_attr_inactive.c
> +++ b/fs/xfs/xfs_attr_inactive.c
> @@ -15,10 +15,10 @@
>  #include "xfs_da_format.h"
>  #include "xfs_da_btree.h"
>  #include "xfs_inode.h"
> +#include "xfs_attr.h"
>  #include "xfs_attr_remote.h"
>  #include "xfs_trans.h"
>  #include "xfs_bmap.h"
> -#include "xfs_attr.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_quota.h"
>  #include "xfs_dir2.h"
> -- 
> 2.7.4
> 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux