Re: [PATCH RFC] xfs: refactor xfs_attr_set() into incremental components

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 1/7/21 9:14 AM, Brian Foster wrote:
POC to explore whether xfs_attr_set() can be refactored into
incremental components to facilitate isolated state management.

Not-Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
---

Hi all,

This is a followup to the ongoing discussion with Allison around delayed
attrs and the xfs_attr_set() path in particular. It is a continuation of
the RFC patch posted here[1]. One of the things that concerns me about
the current approach is not so much the state management, but the
resulting structure that the current xattr implementation imposes on the
state machine code. Earlier on in this effort, we discussed an objective
to keep the state management code as isolated as possible (ideally to a
single level) from the functional xattr code. The purpose of this RFC is
to explore whether the existing code can be refactored in such a way to
accommodate that.

Note that this is patch is not intended to be functional. It is compile
tested only, takes intentional shortcuts, and is intended only to
illustrate an idea / potential approach. IOW, this was essentially a
blitz through the set codepath to try and determine whether this kind of
approach was feasible, not necessarily an attempt to implement it
correctly.

Also note that some code has been borrowed from Allison's series, but
otherwise a crude state machine mechanism has been hacked in just to
support the associated refactoring. This state machine code is not
intended to replace the broader mechanism Allison has implemented. I
needed something to support breaking down the code into components and
didn't want to pull in a world of infrastructure, so I hacked in the
bare minimal mechanism necessary to support that effort. This state
management code should ultimately be thrown away and is not the focus of
the patch. I do have a local git branch with more granular commits, but
it's kind of a mess atm so I squashed this down to a single patch since
it is primarily intended to generate discussion.

The flow of development was generally as follows:

1. Implement a basic transaction rolling and function reentry loop
(based on -EAGAIN). This is the primary loop in xfs_attr_set_iter() and
based on Allison's code.

2. Tack on a crude mechanism to implement incremental states. This is
essentially the switch statement inside the aforementioned loop. Any
state that returns -EAGAIN is reentered after a transaction roll.
Otherwise a non-error return increments to the next state. Note that
some states are semi-artificial in that they don't ever repeat, so could
potentially be optimized away.

3. With the above infrastructure in place, incrementally convert the
existing xattr set implementation into reentrant components. This is
accomplished by peeling off bits of the existing implemention that are
currently separated by explicit transaction rolls and working them into
the state management loop. On termination of the loop, we call into the
remainder of the explicit rolling implementation to maintain
functionality.

The state machine consists of the following high level states:

0. Set the xattr fork format and add the attr name. This state repeats
as the fork is converted into something that can hold the requested
xattr. If the set completes in shortform format, the entire operation
completes.
1. Find a hole for a remote value, if necessary. This state does not
repeat or roll the transaction.
2. Allocate blocks for the remote value, if necessary. This state
repeats until all required blocks are allocated.
3. Write the remote value, if necessary. This state does not repeat or
roll.
4. Clear or flip the inactive flag depending on whether the set is a
rename. If !rename, the flag is cleared and the set returns. Otherwise,
the flag is flipped to the old xattr and we progress to the next state.
5. Invalidate remote blocks for the old xattr, if necessary. This state
does not roll or repeat.
6. Remove remote blocks from the old xattr. This state repeats until all
extents for the old remote value are removed.

Finally, we fall back into what remains of the existing leaf/node
implementations. At this point this consists of removing the old xattr
name and some final attr fork format cleanup, if necessary. This code
should ultimately be reworked as well, but I didn't see any transaction
rolls through here and so decided it was sufficient to stop at this
point for the purpose of the RFC.

To me, the primary takeaways from this are that it seems reasonably
possible to clean up the xattr set codepath such that we don't require a
large number of per-format states and that we can do so in a way that
state management code is isolated to a single function (or single switch
statement). This is demonstrated by explicitly containing throwaway
state management code within xfs_attr_set_iter() and refactoring the
functional code into components that either complete (return 0) or
repeat (return -EAGAIN). Though it may not be apparent from the squashed
together RFC patch, this also suggests a more incremental development
approach is possible, as this patch was developed in a manner that
implemented one (or several related) states at a time with the intent to
maintain functionality at each step. Thoughts? >
Brian

Alrighty, I think I see what you mean to illustrate here. Maybe I can use what you have here as a sort of guide to get a functional version working. I think it may look a little cleaner once we get it there since a lot of this is a bit of a substitute for the bigger set. I will see if I can work through it and post back. Or if something doesnt work, I'll make of note of it.

Thank you for all your help, I know it's a really complicated set, but it feels like we're makeing progress :-)

Allison



[1] https://urldefense.com/v3/__https://lore.kernel.org/linux-xfs/20201218072917.16805-1-allison.henderson@xxxxxxxxxx/T/*m3fcf7be3a8154ab98ddc9e1d45bc764d79d39dc3__;Iw!!GqivPVa7Brio!Le5Q-6GnjBKTG_b64Oh7dGImvE5RQbKK0mrqUaxi0Bl7bWhrtqDKXuIh_j3_vIYI5ibg$

  fs/xfs/libxfs/xfs_attr.c        | 361 ++++++++++++--------------------
  fs/xfs/libxfs/xfs_attr_remote.c |  67 ++----
  fs/xfs/libxfs/xfs_attr_remote.h |   4 +-
  3 files changed, 154 insertions(+), 278 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index fd8e6418a0d3..216055b6ad0d 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -58,6 +58,9 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
  				 struct xfs_da_state **state);
  STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
  STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
+STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *, struct xfs_buf *);
+STATIC int xfs_attr_node_addname_work(struct xfs_da_args *);
+STATIC void xfs_attr_restore_rmt_blk(struct xfs_da_args *args);
int
  xfs_inode_hasattr(
@@ -216,118 +219,153 @@ xfs_attr_is_shortform(
  		ip->i_afp->if_nextents == 0);
  }
-/*
- * Attempts to set an attr in shortform, or converts short form to leaf form if
- * there is not enough room.  If the attr is set, the transaction is committed
- * and set to NULL.
- */
-STATIC int
-xfs_attr_set_shortform(
+int
+xfs_attr_set_fmt(
  	struct xfs_da_args	*args,
-	struct xfs_buf		**leaf_bp)
+	bool			*done)
  {
  	struct xfs_inode	*dp = args->dp;
-	int			error, error2 = 0;
+	struct xfs_buf		*leaf_bp = NULL;
+	int			error = 0;
- /*
-	 * Try to add the attr to the attribute list in the inode.
-	 */
-	error = xfs_attr_try_sf_addname(dp, args);
-	if (error != -ENOSPC) {
-		error2 = xfs_trans_commit(args->trans);
-		args->trans = NULL;
-		return error ? error : error2;
+	if (xfs_attr_is_shortform(dp)) {
+		error = xfs_attr_try_sf_addname(dp, args);
+		if (!error)
+			*done = true;
+		if (error != -ENOSPC)
+			return error;
+
+		error = xfs_attr_shortform_to_leaf(args, &leaf_bp);
+		if (error)
+			return error;
+		return -EAGAIN;
  	}
-	/*
-	 * It won't fit in the shortform, transform to a leaf block.  GROT:
-	 * another possible req'mt for a double-split btree op.
-	 */
-	error = xfs_attr_shortform_to_leaf(args, leaf_bp);
-	if (error)
-		return error;
- /*
-	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
-	 * push cannot grab the half-baked leaf buffer and run into problems
-	 * with the write verifier. Once we're done rolling the transaction we
-	 * can release the hold and add the attr to the leaf.
-	 */
-	xfs_trans_bhold(args->trans, *leaf_bp);
-	error = xfs_defer_finish(&args->trans);
-	xfs_trans_bhold_release(args->trans, *leaf_bp);
-	if (error) {
-		xfs_trans_brelse(args->trans, *leaf_bp);
-		return error;
+	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
+		struct xfs_buf	*bp = NULL;
+
+		error = xfs_attr_leaf_try_add(args, bp);
+		if (error != -ENOSPC)
+			return error;
+
+		error = xfs_attr3_leaf_to_node(args);
+		if (error)
+			return error;
+		return -EAGAIN;
  	}
- return 0;
+	return xfs_attr_node_addname(args);
  }
/*
   * Set the attribute specified in @args.
   */
  int
-xfs_attr_set_args(
+__xfs_attr_set_args(
  	struct xfs_da_args	*args)
  {
  	struct xfs_inode	*dp = args->dp;
-	struct xfs_buf          *leaf_bp = NULL;
  	int			error = 0;
- /*
-	 * If the attribute list is already in leaf format, jump straight to
-	 * leaf handling.  Otherwise, try to add the attribute to the shortform
-	 * list; if there's no room then convert the list to leaf format and try
-	 * again.
-	 */
-	if (xfs_attr_is_shortform(dp)) {
-
-		/*
-		 * If the attr was successfully set in shortform, the
-		 * transaction is committed and set to NULL.  Otherwise, is it
-		 * converted from shortform to leaf, and the transaction is
-		 * retained.
-		 */
-		error = xfs_attr_set_shortform(args, &leaf_bp);
-		if (error || !args->trans)
-			return error;
-	}
-
  	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
  		error = xfs_attr_leaf_addname(args);
-		if (error != -ENOSPC)
+		if (error)
  			return error;
+	}
- /*
-		 * Promote the attribute list to the Btree format.
-		 */
-		error = xfs_attr3_leaf_to_node(args);
-		if (error)
+	error = xfs_attr_node_addname_work(args);
+	return error;
+}
+
+int
+xfs_attr_set_iter(
+	struct xfs_da_args	*args,
+	bool			*done)
+{
+	int			error;
+	int			state = 0;
+	xfs_dablk_t		lblkno;
+	int			blkcnt;
+
+	do {
+		switch (state) {
+		case 0:	/* SET_FMT */
+			error = xfs_attr_set_fmt(args, done);
+			if (*done)
+				return error;
+			break;
+		case 1: /* RMT_FIND_HOLE */
+			if (args->rmtblkno <= 0)
+				break;
+
+			trace_xfs_attr_rmtval_set(args);
+			error = xfs_attr_rmt_find_hole(args);
+			lblkno = (xfs_dablk_t)args->rmtblkno;
+			blkcnt = args->rmtblkcnt;
+			state++;
+			continue;
+		case 2: /* RMTVAL_ALLOC */
+			if (args->rmtblkno <= 0)
+				break;
+			error = xfs_attr_rmtval_set(args, &lblkno, &blkcnt);
+			break;
+		case 3: /* RMTVAL_SET */
+			if (args->rmtblkno <= 0)
+				break;
+			error = xfs_attr_rmtval_set_value(args);
+			state++;
+			continue;
+		case 4:	/* SET_FLAG */
+			if (args->op_flags & XFS_DA_OP_RENAME) {
+				error = xfs_attr3_leaf_flipflags(args);
+			} else {
+				if (args->rmtblkno > 0)
+					error = xfs_attr3_leaf_clearflag(args);
+				return error;
+			}
+			break;
+		case 5: /* RMT_INVALIDATE */
+			xfs_attr_restore_rmt_blk(args);
+			if (args->rmtblkno)
+				error = xfs_attr_rmtval_invalidate(args);
+			state++;
+			continue;
+		case 6: /* RMT_REMOVE */
+			error = __xfs_attr_rmtval_remove(args);
+			break;
+		default:
  			return error;
+		};
- /*
-		 * Finish any deferred work items and roll the transaction once
-		 * more.  The goal here is to call node_addname with the inode
-		 * and transaction in the same state (inode locked and joined,
-		 * transaction clean) no matter how we got to this step.
-		 */
-		error = xfs_defer_finish(&args->trans);
-		if (error)
+		if (!error)
+			state++;
+		else if (error != -EAGAIN)
  			return error;
- /*
-		 * Commit the current trans (including the inode) and
-		 * start a new one.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
+		error = xfs_defer_finish(&args->trans);
  		if (error)
-			return error;
-	}
+			break;
+		error = xfs_trans_roll_inode(&args->trans, args->dp);
+	} while (!error);
- error = xfs_attr_node_addname(args);
  	return error;
  }
+int
+xfs_attr_set_args(
+	struct xfs_da_args	*args)
+
+{
+	int			error;
+	bool			done = false;
+
+	error = xfs_attr_set_iter(args, &done);
+	if (error || done)
+		return error;
+
+	return __xfs_attr_set_args(args);
+}
+
  /*
   * Return EEXIST if attr is found, or ENOATTR if not
   */
@@ -676,76 +714,6 @@ xfs_attr_leaf_addname(
trace_xfs_attr_leaf_addname(args); - error = xfs_attr_leaf_try_add(args, bp);
-	if (error)
-		return error;
-
-	/*
-	 * Commit the transaction that added the attr name so that
-	 * later routines can manage their own transactions.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
-	if (error)
-		return error;
-
-	/*
-	 * If there was an out-of-line value, allocate the blocks we
-	 * identified for its storage and copy the value.  This is done
-	 * after we create the attribute so that we don't overflow the
-	 * maximum size of a transaction and/or hit a deadlock.
-	 */
-	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args);
-		if (error)
-			return error;
-	}
-
-	if (!(args->op_flags & XFS_DA_OP_RENAME)) {
-		/*
-		 * Added a "remote" value, just clear the incomplete flag.
-		 */
-		if (args->rmtblkno > 0)
-			error = xfs_attr3_leaf_clearflag(args);
-
-		return error;
-	}
-
-	/*
-	 * If this is an atomic rename operation, we must "flip" the incomplete
-	 * flags on the "new" and "old" attribute/value pairs so that one
-	 * disappears and one appears atomically.  Then we must remove the "old"
-	 * attribute/value pair.
-	 *
-	 * In a separate transaction, set the incomplete flag on the "old" attr
-	 * and clear the incomplete flag on the "new" attr.
-	 */
-
-	error = xfs_attr3_leaf_flipflags(args);
-	if (error)
-		return error;
-	/*
-	 * Commit the flag value change and start the next trans in series.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, args->dp);
-	if (error)
-		return error;
-
-	/*
-	 * Dismantle the "old" attribute/value pair by removing a "remote" value
-	 * (if it exists).
-	 */
-	xfs_attr_restore_rmt_blk(args);
-
-	if (args->rmtblkno) {
-		error = xfs_attr_rmtval_invalidate(args);
-		if (error)
-			return error;
-
-		error = xfs_attr_rmtval_remove(args);
-		if (error)
-			return error;
-	}
-
  	/*
  	 * Read in the block containing the "old" attr, then remove the "old"
  	 * attr from that block (neat, huh!)
@@ -923,7 +891,7 @@ xfs_attr_node_addname(
  	 * Fill in bucket of arguments/results/context to carry around.
  	 */
  	dp = args->dp;
-restart:
+
  	/*
  	 * Search to see if name already exists, and get back a pointer
  	 * to where it should go.
@@ -967,21 +935,10 @@ xfs_attr_node_addname(
  			xfs_da_state_free(state);
  			state = NULL;
  			error = xfs_attr3_leaf_to_node(args);
-			if (error)
-				goto out;
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				goto out;
-
-			/*
-			 * Commit the node conversion and start the next
-			 * trans in the chain.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
  			if (error)
  				goto out;
- goto restart;
+			return -EAGAIN;
  		}
/*
@@ -993,9 +950,6 @@ xfs_attr_node_addname(
  		error = xfs_da3_split(state);
  		if (error)
  			goto out;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			goto out;
  	} else {
  		/*
  		 * Addition succeeded, update Btree hashvals.
@@ -1010,70 +964,23 @@ xfs_attr_node_addname(
  	xfs_da_state_free(state);
  	state = NULL;
- /*
-	 * Commit the leaf addition or btree split and start the next
-	 * trans in the chain.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
-	if (error)
-		goto out;
-
-	/*
-	 * If there was an out-of-line value, allocate the blocks we
-	 * identified for its storage and copy the value.  This is done
-	 * after we create the attribute so that we don't overflow the
-	 * maximum size of a transaction and/or hit a deadlock.
-	 */
-	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args);
-		if (error)
-			return error;
-	}
-
-	if (!(args->op_flags & XFS_DA_OP_RENAME)) {
-		/*
-		 * Added a "remote" value, just clear the incomplete flag.
-		 */
-		if (args->rmtblkno > 0)
-			error = xfs_attr3_leaf_clearflag(args);
-		retval = error;
-		goto out;
-	}
+	return 0;
- /*
-	 * If this is an atomic rename operation, we must "flip" the incomplete
-	 * flags on the "new" and "old" attribute/value pairs so that one
-	 * disappears and one appears atomically.  Then we must remove the "old"
-	 * attribute/value pair.
-	 *
-	 * In a separate transaction, set the incomplete flag on the "old" attr
-	 * and clear the incomplete flag on the "new" attr.
-	 */
-	error = xfs_attr3_leaf_flipflags(args);
-	if (error)
-		goto out;
-	/*
-	 * Commit the flag value change and start the next trans in series
-	 */
-	error = xfs_trans_roll_inode(&args->trans, args->dp);
+out:
+	if (state)
+		xfs_da_state_free(state);
  	if (error)
-		goto out;
-
-	/*
-	 * Dismantle the "old" attribute/value pair by removing a "remote" value
-	 * (if it exists).
-	 */
-	xfs_attr_restore_rmt_blk(args);
-
-	if (args->rmtblkno) {
-		error = xfs_attr_rmtval_invalidate(args);
-		if (error)
-			return error;
+		return error;
+	return retval;
+}
- error = xfs_attr_rmtval_remove(args);
-		if (error)
-			return error;
-	}
+STATIC int
+xfs_attr_node_addname_work(
+	struct xfs_da_args	*args)
+{
+	struct xfs_da_state	*state;
+	struct xfs_da_state_blk	*blk;
+	int			retval, error;
/*
  	 * Re-find the "old" attribute entry after any split ops. The INCOMPLETE
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 48d8e9caf86f..2c02875a4930 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -441,7 +441,7 @@ xfs_attr_rmtval_get(
   * Find a "hole" in the attribute address space large enough for us to drop the
   * new attribute's value into
   */
-STATIC int
+int
  xfs_attr_rmt_find_hole(
  	struct xfs_da_args	*args)
  {
@@ -468,7 +468,7 @@ xfs_attr_rmt_find_hole(
  	return 0;
  }
-STATIC int
+int
  xfs_attr_rmtval_set_value(
  	struct xfs_da_args	*args)
  {
@@ -567,64 +567,31 @@ xfs_attr_rmtval_stale(
   */
  int
  xfs_attr_rmtval_set(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	xfs_dablk_t		*lblkno,
+	int			*blkcnt)
  {
  	struct xfs_inode	*dp = args->dp;
  	struct xfs_bmbt_irec	map;
-	xfs_dablk_t		lblkno;
-	int			blkcnt;
  	int			nmap;
  	int			error;
- trace_xfs_attr_rmtval_set(args);
-
-	error = xfs_attr_rmt_find_hole(args);
+	nmap = 1;
+	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)*lblkno,
+			  *blkcnt, XFS_BMAPI_ATTRFORK, args->total, &map,
+			  &nmap);
  	if (error)
  		return error;
- blkcnt = args->rmtblkcnt;
-	lblkno = (xfs_dablk_t)args->rmtblkno;
-	/*
-	 * Roll through the "value", allocating blocks on disk as required.
-	 */
-	while (blkcnt > 0) {
-		/*
-		 * Allocate a single extent, up to the size of the value.
-		 *
-		 * Note that we have to consider this a data allocation as we
-		 * write the remote attribute without logging the contents.
-		 * Hence we must ensure that we aren't using blocks that are on
-		 * the busy list so that we don't overwrite blocks which have
-		 * recently been freed but their transactions are not yet
-		 * committed to disk. If we overwrite the contents of a busy
-		 * extent and then crash then the block may not contain the
-		 * correct metadata after log recovery occurs.
-		 */
-		nmap = 1;
-		error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)lblkno,
-				  blkcnt, XFS_BMAPI_ATTRFORK, args->total, &map,
-				  &nmap);
-		if (error)
-			return error;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
-
-		ASSERT(nmap == 1);
-		ASSERT((map.br_startblock != DELAYSTARTBLOCK) &&
-		       (map.br_startblock != HOLESTARTBLOCK));
-		lblkno += map.br_blockcount;
-		blkcnt -= map.br_blockcount;
+	ASSERT(nmap == 1);
+	ASSERT((map.br_startblock != DELAYSTARTBLOCK) &&
+	       (map.br_startblock != HOLESTARTBLOCK));
+	*lblkno += map.br_blockcount;
+	*blkcnt -= map.br_blockcount;
- /*
-		 * Start the next trans in the chain.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
-	}
-
-	return xfs_attr_rmtval_set_value(args);
+	if (*blkcnt)
+		return -EAGAIN;
+	return 0;
  }
/*
diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
index 9eee615da156..74d768dd8afa 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.h
+++ b/fs/xfs/libxfs/xfs_attr_remote.h
@@ -7,9 +7,11 @@
  #define	__XFS_ATTR_REMOTE_H__
int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
+int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
int xfs_attr_rmtval_get(struct xfs_da_args *args);
-int xfs_attr_rmtval_set(struct xfs_da_args *args);
+int xfs_attr_rmtval_set(struct xfs_da_args *args, xfs_dablk_t *, int *);
+int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
  int xfs_attr_rmtval_remove(struct xfs_da_args *args);
  int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
  		xfs_buf_flags_t incore_flags);




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux