On 11/11/21 4:17 PM, Dave Chinner wrote:
On Thu, Nov 11, 2021 at 12:17:15AM +0000, Catherine Hoang wrote:
Add an error tag on xfs_da3_split to test log attribute recovery
and replay.
Signed-off-by: Catherine Hoang <catherine.hoang@xxxxxxxxxx>
---
fs/xfs/libxfs/xfs_da_btree.c | 6 ++++++
fs/xfs/libxfs/xfs_errortag.h | 4 +++-
fs/xfs/xfs_error.c | 3 +++
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index dd7a2dbce1d1..000101783648 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -22,6 +22,7 @@
#include "xfs_trace.h"
#include "xfs_buf_item.h"
#include "xfs_log.h"
+#include "xfs_errortag.h"
/*
* xfs_da_btree.c
@@ -482,6 +483,11 @@ xfs_da3_split(
trace_xfs_da_split(state->args);
+ if (XFS_TEST_ERROR(false, state->mp, XFS_ERRTAG_LEAF_SPLIT)) {
+ error = -EIO;
+ return error;
+ }
+
/*
* Walk back up the tree splitting/inserting/adjusting as necessary.
* If we need to insert and there isn't room, split the node, then
diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index c15d2340220c..31aeeb94dd5b 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -60,7 +60,8 @@
#define XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT 37
#define XFS_ERRTAG_AG_RESV_FAIL 38
#define XFS_ERRTAG_LARP 39
-#define XFS_ERRTAG_MAX 40
+#define XFS_ERRTAG_LEAF_SPLIT 40
What leaf is being split?
This looks to be a DA btree split that it the error injection is
being applied to, not a allocbt, rmapbt, etc split. And it's not
really a "leaf split" because xfs_da3_split() walks the entire path
back up the tree splitting nodes as well.
So, really, it's a da tree split, not a generic "leaf split" error
injection point.
And, I suspect this won't always work as intended, because it can
trigger on directory operations as well as attribute ops. Hence it
could be difficult to direct this for testing attr fork operations
during stress at the attr fork....
Well, the intent isnt to use these during a stress test. The idea it to
set the tag right before an attribute operation, and then see if the
attribute is played back as it should. Having the different points of
failure demonstrates that the replay is successful even when the failure
occurs during the different attribute forms.
That context of being in an attr operation is set up by the testcase.
We create the dirs, and files first and then set the error tag before
the attr operation. So these error tags wouldn't otherwise trigger on a
dir operation unless for some reason we wrote another test case that did
that. But that doesn't seem like it would be a very meaningful
testcase? Or it would have a different objective that it was trying to
accomplish say the least.
Perhaps all we need here is a name scheme to make the intention clear?
How do people feel about this naming scheme?
XFS_ERRTAG_LARP # Test log attr replay
XFS_ERRTAG_LARP_LEAF_SPLIT # Test log attr replay on leaf split
XFS_ERRTAG_LARP_LEAF_TO_NODE # Test log attr replay on leaf to node
That would also help to make clear that these are meant to trigger on
the attribute code path for a replay. Let me know what people think.
Thanks!
Allison
Cheers,
Dave.