On Fri, Apr 02, 2021 at 02:42:28AM -0700, Allison Henderson wrote: > > > On 4/1/21 9:55 AM, Brian Foster wrote: > > On Thu, Mar 25, 2021 at 05:33:07PM -0700, Allison Henderson wrote: > > > This patch modifies the attr remove routines to be delay ready. This > > > means they no longer roll or commit transactions, but instead return > > > -EAGAIN to have the calling routine roll and refresh the transaction. In > > > this series, xfs_attr_remove_args is merged with > > > xfs_attr_node_removename become a new function, xfs_attr_remove_iter. > > > This new version uses a sort of state machine like switch to keep track > > > of where it was when EAGAIN was returned. A new version of > > > xfs_attr_remove_args consists of a simple loop to refresh the > > > transaction until the operation is completed. A new XFS_DAC_DEFER_FINISH > > > flag is used to finish the transaction where ever the existing code used > > > to. > > > > > > Calls to xfs_attr_rmtval_remove are replaced with the delay ready > > > version __xfs_attr_rmtval_remove. We will rename > > > __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are > > > done. > > > > > > xfs_attr_rmtval_remove itself is still in use by the set routines (used > > > during a rename). For reasons of preserving existing function, we > > > modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is > > > set. Similar to how xfs_attr_remove_args does here. Once we transition > > > the set routines to be delay ready, xfs_attr_rmtval_remove is no longer > > > used and will be removed. > > > > > > This patch also adds a new struct xfs_delattr_context, which we will use > > > to keep track of the current state of an attribute operation. The new > > > xfs_delattr_state enum is used to track various operations that are in > > > progress so that we know not to repeat them, and resume where we left > > > off before EAGAIN was returned to cycle out the transaction. Other > > > members take the place of local variables that need to retain their > > > values across multiple function recalls. See xfs_attr.h for a more > > > detailed diagram of the states. > > > > > > Signed-off-by: Allison Henderson <allison.henderson@xxxxxxxxxx> > > > --- > > > fs/xfs/libxfs/xfs_attr.c | 206 +++++++++++++++++++++++++++------------- > > > fs/xfs/libxfs/xfs_attr.h | 125 ++++++++++++++++++++++++ > > > fs/xfs/libxfs/xfs_attr_leaf.c | 2 +- > > > fs/xfs/libxfs/xfs_attr_remote.c | 48 ++++++---- > > > fs/xfs/libxfs/xfs_attr_remote.h | 2 +- > > > fs/xfs/xfs_attr_inactive.c | 2 +- > > > 6 files changed, 297 insertions(+), 88 deletions(-) > > > > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > > > index 41accd5..4a73691 100644 > > > --- a/fs/xfs/libxfs/xfs_attr.c > > > +++ b/fs/xfs/libxfs/xfs_attr.c > > ... > > > @@ -221,6 +220,32 @@ xfs_attr_is_shortform( > > > ip->i_afp->if_nextents == 0); > > > } > > > +/* > > > + * Checks to see if a delayed attribute transaction should be rolled. If so, > > > + * also checks for a defer finish. Transaction is finished and rolled as > > > + * needed, and returns true of false if the delayed operation should continue. > > > + */ > > > > Outdated comment wrt to the return value. > Ok, will drop last line here > > > > > > +int > > > +xfs_attr_trans_roll( > > > + struct xfs_delattr_context *dac) > > > +{ > > > + struct xfs_da_args *args = dac->da_args; > > > + int error; > > > + > > > + if (dac->flags & XFS_DAC_DEFER_FINISH) { > > > + /* > > > + * The caller wants us to finish all the deferred ops so that we > > > + * avoid pinning the log tail with a large number of deferred > > > + * ops. > > > + */ > > > + dac->flags &= ~XFS_DAC_DEFER_FINISH; > > > + error = xfs_defer_finish(&args->trans); > > > + } else > > > + error = xfs_trans_roll_inode(&args->trans, args->dp); > > > + > > > + return error; > > > +} > > > + > > > STATIC int > > > xfs_attr_set_fmt( > > > struct xfs_da_args *args) > > ... > > > @@ -1232,70 +1264,114 @@ xfs_attr_node_remove_cleanup( > > > } > > > /* > > > - * Remove a name from a B-tree attribute list. > > > + * Remove the attribute specified in @args. > > > * > > > * This will involve walking down the Btree, and may involve joining > > > * leaf nodes and even joining intermediate nodes up to and including > > > * the root node (a special case of an intermediate node). > > > + * > > > + * This routine is meant to function as either an in-line or delayed operation, > > > + * and may return -EAGAIN when the transaction needs to be rolled. Calling > > > + * functions will need to handle this, and recall the function until a > > > + * successful error code is returned. > > > */ > > > -STATIC int > > > -xfs_attr_node_removename( > > > - struct xfs_da_args *args) > > > +int > > > +xfs_attr_remove_iter( > > > + struct xfs_delattr_context *dac) > > > { > > > - struct xfs_da_state *state; > > > - int retval, error; > > > - struct xfs_inode *dp = args->dp; > > > + struct xfs_da_args *args = dac->da_args; > > > + struct xfs_da_state *state = dac->da_state; > > > + int retval, error; > > > + struct xfs_inode *dp = args->dp; > > > trace_xfs_attr_node_removename(args); > > > - error = xfs_attr_node_removename_setup(args, &state); > > > - if (error) > > > - goto out; > > > + switch (dac->dela_state) { > > > + case XFS_DAS_UNINIT: > > > + if (!xfs_inode_hasattr(dp)) > > > + return -ENOATTR; > > > - /* > > > - * If there is an out-of-line value, de-allocate the blocks. > > > - * This is done before we remove the attribute so that we don't > > > - * overflow the maximum size of a transaction and/or hit a deadlock. > > > - */ > > > - if (args->rmtblkno > 0) { > > > - error = xfs_attr_rmtval_remove(args); > > > - if (error) > > > - goto out; > > > + if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) { > > > + ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > > > + return xfs_attr_shortform_remove(args); > > > + } > > > + > > > + if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) > > > + return xfs_attr_leaf_removename(args); > > > + > > > + /* fallthrough */ > > > + case XFS_DAS_RMTBLK: > > > + dac->dela_state = XFS_DAS_RMTBLK; > > > + > > > + if (!dac->da_state) { > > > + error = xfs_attr_node_removename_setup(dac); > > > + if (error) > > > + goto out; > > > > Do we need the goto here if _removename_setup() frees state on error (or > > is the latter change necessary)? > I think we can safely return here. Will update > > > > > > + } > > > + state = dac->da_state; > > > > Also, can this fold into the above if (!da_state) branch? Or maybe the > > whole setup branch pulled up into the UNINIT state? Not a big deal, but > > it does look a little out of place in the RMTBLK state. > Sure, it should be ok, there isnt any EAGAINs here, so it shouldnt make a > difference > > > > > > /* > > > - * Refill the state structure with buffers, the prior calls > > > - * released our buffers. > > > + * If there is an out-of-line value, de-allocate the blocks. > > > + * This is done before we remove the attribute so that we don't > > > + * overflow the maximum size of a transaction and/or hit a > > > + * deadlock. > > > */ > > > - error = xfs_attr_refillstate(state); > > > - if (error) > > > - goto out; > > > - } > > > - retval = xfs_attr_node_remove_cleanup(args, state); > > > + if (args->rmtblkno > 0) { > > > + /* > > > + * May return -EAGAIN. Remove blocks until > > > + * args->rmtblkno == 0 > > > + */ > > > + error = __xfs_attr_rmtval_remove(dac); > > > + if (error) > > > + break; > > > > I feel that the difference between a break and goto out might confuse > > some of the error handling. Right now, it looks like the exit path > > handles either scenario, so we could presumably do something like the > > following at the end of the function: > > > > if (error != -EAGAIN && state) > > xfs_da_state_free(state); > > return error; > > > > ... and just ditch the label. Alternatively we could retain the label above > > the state check, but just use it consistently throughout the function. > > > Either will work? I think I'd prefer the gotos over the breaks though, I > just think it reads easier. The switch is sort of big, so I think the gotos > make it a little more clear in that we're exiting the function without > having to skim all the way to the bottom. > Sounds reasonable to me as long as the error handling usage is consistent. Thanks. Brian > > Other than those few nits, this one looks pretty good to me. > Great, will update. Thanks! > > Allison > > > > > Brian > > > > > + > > > + /* > > > + * Refill the state structure with buffers, the prior > > > + * calls released our buffers. > > > + */ > > > + ASSERT(args->rmtblkno == 0); > > > + error = xfs_attr_refillstate(state); > > > + if (error) > > > + goto out; > > > + > > > + dac->flags |= XFS_DAC_DEFER_FINISH; > > > + return -EAGAIN; > > > + } > > > + > > > + retval = xfs_attr_node_remove_cleanup(args, state); > > > - /* > > > - * Check to see if the tree needs to be collapsed. > > > - */ > > > - if (retval && (state->path.active > 1)) { > > > - error = xfs_da3_join(state); > > > - if (error) > > > - goto out; > > > - error = xfs_defer_finish(&args->trans); > > > - if (error) > > > - goto out; > > > /* > > > - * Commit the Btree join operation and start a new trans. > > > + * Check to see if the tree needs to be collapsed. Set the flag > > > + * to indicate that the calling function needs to move the > > > + * shrink operation > > > */ > > > - error = xfs_trans_roll_inode(&args->trans, dp); > > > - if (error) > > > - goto out; > > > - } > > > + if (retval && (state->path.active > 1)) { > > > + error = xfs_da3_join(state); > > > + if (error) > > > + goto out; > > > - /* > > > - * If the result is small enough, push it all into the inode. > > > - */ > > > - if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) > > > - error = xfs_attr_node_shrink(args, state); > > > + dac->flags |= XFS_DAC_DEFER_FINISH; > > > + dac->dela_state = XFS_DAS_RM_SHRINK; > > > + return -EAGAIN; > > > + } > > > + > > > + /* fallthrough */ > > > + case XFS_DAS_RM_SHRINK: > > > + /* > > > + * If the result is small enough, push it all into the inode. > > > + */ > > > + if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) > > > + error = xfs_attr_node_shrink(args, state); > > > + > > > + break; > > > + default: > > > + ASSERT(0); > > > + error = -EINVAL; > > > + goto out; > > > + } > > > + if (error == -EAGAIN) > > > + return error; > > > out: > > > if (state) > > > xfs_da_state_free(state); > > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h > > > index 3e97a93..92a6a50 100644 > > > --- a/fs/xfs/libxfs/xfs_attr.h > > > +++ b/fs/xfs/libxfs/xfs_attr.h > > > @@ -74,6 +74,127 @@ struct xfs_attr_list_context { > > > }; > > > +/* > > > + * ======================================================================== > > > + * Structure used to pass context around among the delayed routines. > > > + * ======================================================================== > > > + */ > > > + > > > +/* > > > + * Below is a state machine diagram for attr remove operations. The XFS_DAS_* > > > + * states indicate places where the function would return -EAGAIN, and then > > > + * immediately resume from after being recalled by the calling function. States > > > + * marked as a "subroutine state" indicate that they belong to a subroutine, and > > > + * so the calling function needs to pass them back to that subroutine to allow > > > + * it to finish where it left off. But they otherwise do not have a role in the > > > + * calling function other than just passing through. > > > + * > > > + * xfs_attr_remove_iter() > > > + * │ > > > + * v > > > + * have attr to remove? ──n──> done > > > + * │ > > > + * y > > > + * │ > > > + * v > > > + * are we short form? ──y──> xfs_attr_shortform_remove ──> done > > > + * │ > > > + * n > > > + * │ > > > + * V > > > + * are we leaf form? ──y──> xfs_attr_leaf_removename ──> done > > > + * │ > > > + * n > > > + * │ > > > + * V > > > + * ┌── need to setup state? > > > + * │ │ > > > + * n y > > > + * │ │ > > > + * │ v > > > + * │ find attr and get state > > > + * │ attr has blks? ───n────??? > > > + * │ │ v > > > + * │ │ find and invalidate > > > + * │ y the blocks. mark > > > + * │ │ attr incomplete > > > + * │ ├────────────────┘ > > > + * └──────────┤ > > > + * │ > > > + * v > > > + * Have blks to remove? ─────y────??? > > > + * │ ^ remove the blks > > > + * │ │ │ > > > + * │ │ v > > > + * │ │ refill the state > > > + * n │ │ > > > + * │ │ v > > > + * │ │ XFS_DAS_RMTBLK > > > + * │ └───── re-enter with one > > > + * │ less blk to remove > > > + * │ > > > + * v > > > + * remove leaf and > > > + * update hash with > > > + * xfs_attr_node_remove_cleanup > > > + * │ > > > + * v > > > + * need to > > > + * shrink tree? ─n─??? > > > + * │ │ > > > + * y │ > > > + * │ │ > > > + * v │ > > > + * join leaf │ > > > + * │ │ > > > + * v │ > > > + * XFS_DAS_RM_SHRINK │ > > > + * │ │ > > > + * v │ > > > + * do the shrink │ > > > + * │ │ > > > + * v │ > > > + * free state <──┘ > > > + * │ > > > + * v > > > + * done > > > + * > > > + */ > > > + > > > +/* > > > + * Enum values for xfs_delattr_context.da_state > > > + * > > > + * These values are used by delayed attribute operations to keep track of where > > > + * they were before they returned -EAGAIN. A return code of -EAGAIN signals the > > > + * calling function to roll the transaction, and then recall the subroutine to > > > + * finish the operation. The enum is then used by the subroutine to jump back > > > + * to where it was and resume executing where it left off. > > > + */ > > > +enum xfs_delattr_state { > > > + XFS_DAS_UNINIT = 0, /* No state has been set yet */ > > > + XFS_DAS_RMTBLK, /* Removing remote blks */ > > > + XFS_DAS_RM_SHRINK, /* We are shrinking the tree */ > > > +}; > > > + > > > +/* > > > + * Defines for xfs_delattr_context.flags > > > + */ > > > +#define XFS_DAC_DEFER_FINISH 0x01 /* finish the transaction */ > > > + > > > +/* > > > + * Context used for keeping track of delayed attribute operations > > > + */ > > > +struct xfs_delattr_context { > > > + struct xfs_da_args *da_args; > > > + > > > + /* Used in xfs_attr_node_removename to roll through removing blocks */ > > > + struct xfs_da_state *da_state; > > > + > > > + /* Used to keep track of current state of delayed operation */ > > > + unsigned int flags; > > > + enum xfs_delattr_state dela_state; > > > +}; > > > + > > > /*======================================================================== > > > * Function prototypes for the kernel. > > > *========================================================================*/ > > > @@ -91,6 +212,10 @@ int xfs_attr_set(struct xfs_da_args *args); > > > int xfs_attr_set_args(struct xfs_da_args *args); > > > int xfs_has_attr(struct xfs_da_args *args); > > > int xfs_attr_remove_args(struct xfs_da_args *args); > > > +int xfs_attr_remove_iter(struct xfs_delattr_context *dac); > > > +int xfs_attr_trans_roll(struct xfs_delattr_context *dac); > > > bool xfs_attr_namecheck(const void *name, size_t length); > > > +void xfs_delattr_context_init(struct xfs_delattr_context *dac, > > > + struct xfs_da_args *args); > > > #endif /* __XFS_ATTR_H__ */ > > > diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c > > > index d6ef69a..3780141 100644 > > > --- a/fs/xfs/libxfs/xfs_attr_leaf.c > > > +++ b/fs/xfs/libxfs/xfs_attr_leaf.c > > > @@ -19,8 +19,8 @@ > > > #include "xfs_bmap_btree.h" > > > #include "xfs_bmap.h" > > > #include "xfs_attr_sf.h" > > > -#include "xfs_attr_remote.h" > > > #include "xfs_attr.h" > > > +#include "xfs_attr_remote.h" > > > #include "xfs_attr_leaf.h" > > > #include "xfs_error.h" > > > #include "xfs_trace.h" > > > diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c > > > index 48d8e9c..908521e7 100644 > > > --- a/fs/xfs/libxfs/xfs_attr_remote.c > > > +++ b/fs/xfs/libxfs/xfs_attr_remote.c > > > @@ -674,10 +674,12 @@ xfs_attr_rmtval_invalidate( > > > */ > > > int > > > xfs_attr_rmtval_remove( > > > - struct xfs_da_args *args) > > > + struct xfs_da_args *args) > > > { > > > - int error; > > > - int retval; > > > + int error; > > > + struct xfs_delattr_context dac = { > > > + .da_args = args, > > > + }; > > > trace_xfs_attr_rmtval_remove(args); > > > @@ -685,31 +687,29 @@ xfs_attr_rmtval_remove( > > > * Keep de-allocating extents until the remote-value region is gone. > > > */ > > > do { > > > - retval = __xfs_attr_rmtval_remove(args); > > > - if (retval && retval != -EAGAIN) > > > - return retval; > > > + error = __xfs_attr_rmtval_remove(&dac); > > > + if (error != -EAGAIN) > > > + break; > > > - /* > > > - * Close out trans and start the next one in the chain. > > > - */ > > > - error = xfs_trans_roll_inode(&args->trans, args->dp); > > > + error = xfs_attr_trans_roll(&dac); > > > if (error) > > > return error; > > > - } while (retval == -EAGAIN); > > > + } while (true); > > > - return 0; > > > + return error; > > > } > > > /* > > > * Remove the value associated with an attribute by deleting the out-of-line > > > - * buffer that it is stored on. Returns EAGAIN for the caller to refresh the > > > + * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the > > > * transaction and re-call the function > > > */ > > > int > > > __xfs_attr_rmtval_remove( > > > - struct xfs_da_args *args) > > > + struct xfs_delattr_context *dac) > > > { > > > - int error, done; > > > + struct xfs_da_args *args = dac->da_args; > > > + int error, done; > > > /* > > > * Unmap value blocks for this attr. > > > @@ -719,12 +719,20 @@ __xfs_attr_rmtval_remove( > > > if (error) > > > return error; > > > - error = xfs_defer_finish(&args->trans); > > > - if (error) > > > - return error; > > > - > > > - if (!done) > > > + /* > > > + * We don't need an explicit state here to pick up where we left off. We > > > + * can figure it out using the !done return code. Calling function only > > > + * needs to keep recalling this routine until we indicate to stop by > > > + * returning anything other than -EAGAIN. The actual value of > > > + * attr->xattri_dela_state may be some value reminiscent of the calling > > > + * function, but it's value is irrelevant with in the context of this > > > + * function. Once we are done here, the next state is set as needed > > > + * by the parent > > > + */ > > > + if (!done) { > > > + dac->flags |= XFS_DAC_DEFER_FINISH; > > > return -EAGAIN; > > > + } > > > return error; > > > } > > > diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h > > > index 9eee615..002fd30 100644 > > > --- a/fs/xfs/libxfs/xfs_attr_remote.h > > > +++ b/fs/xfs/libxfs/xfs_attr_remote.h > > > @@ -14,5 +14,5 @@ int xfs_attr_rmtval_remove(struct xfs_da_args *args); > > > int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map, > > > xfs_buf_flags_t incore_flags); > > > int xfs_attr_rmtval_invalidate(struct xfs_da_args *args); > > > -int __xfs_attr_rmtval_remove(struct xfs_da_args *args); > > > +int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac); > > > #endif /* __XFS_ATTR_REMOTE_H__ */ > > > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c > > > index bfad669..aaa7e66 100644 > > > --- a/fs/xfs/xfs_attr_inactive.c > > > +++ b/fs/xfs/xfs_attr_inactive.c > > > @@ -15,10 +15,10 @@ > > > #include "xfs_da_format.h" > > > #include "xfs_da_btree.h" > > > #include "xfs_inode.h" > > > +#include "xfs_attr.h" > > > #include "xfs_attr_remote.h" > > > #include "xfs_trans.h" > > > #include "xfs_bmap.h" > > > -#include "xfs_attr.h" > > > #include "xfs_attr_leaf.h" > > > #include "xfs_quota.h" > > > #include "xfs_dir2.h" > > > -- > > > 2.7.4 > > > > > >