From: Darrick J. Wong <djwong@xxxxxxxxxx>

A while ago, I decided to make phase 4 check the summary counters before
it starts any other repairs, having observed that repairs of primary
metadata can fail because the summary counters (incorrectly) claim that
there aren't enough free resources in the filesystem.  However, if
problems are found in the summary counters, the repair work will be run
as part of the AG 0 repairs, which means that it runs concurrently with
other scrubbers.  This doesn't quite get us to the intended goal, so try
to fix the scrubbers ahead of time.  If that fails, tough, we'll get
back to it in phase 7 if scrub gets that far.

Fixes: cbaf1c9d91a0 ("xfs_scrub: check summary counters")
Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
---
 scrub/phase4.c |   20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/scrub/phase4.c b/scrub/phase4.c
index 789208398b4..f14c3ad58f2 100644
--- a/scrub/phase4.c
+++ b/scrub/phase4.c
@@ -129,6 +129,7 @@ phase4_func(
 	struct scrub_ctx	*ctx)
 {
 	struct xfs_fsop_geom	fsgeom;
+	struct action_list	alist;
 	int			ret;
 
 	if (!have_action_items(ctx))
@@ -136,11 +137,13 @@ phase4_func(
 
 	/*
 	 * Check the summary counters early.  Normally we do this during phase
-	 * seven, but some of the cross-referencing requires fairly-accurate
-	 * counters, so counter repairs have to be put on the list now so that
-	 * they get fixed before we stop retrying unfixed metadata repairs.
+	 * seven, but some of the cross-referencing requires fairly accurate
+	 * summary counters.  Check and try to repair them now to minimize the
+	 * chance that repairs of primary metadata fail due to secondary
+	 * metadata.  If repairs fails, we'll come back during phase 7.
 	 */
-	ret = scrub_fs_counters(ctx, &ctx->action_lists[0]);
+	action_list_init(&alist);
+	ret = scrub_fs_counters(ctx, &alist);
 	if (ret)
 		return ret;
 
@@ -155,11 +158,18 @@ phase4_func(
 		return ret;
 
 	if (fsgeom.sick & XFS_FSOP_GEOM_SICK_QUOTACHECK) {
-		ret = scrub_quotacheck(ctx, &ctx->action_lists[0]);
+		ret = scrub_quotacheck(ctx, &alist);
 		if (ret)
 			return ret;
 	}
 
+	/* Repair counters before starting on the rest. */
+	ret = action_list_process(ctx, -1, &alist,
+			XRM_REPAIR_ONLY | XRM_NOPROGRESS);
+	if (ret)
+		return ret;
+	action_list_discard(&alist);
+
 	ret = repair_everything(ctx);
 	if (ret)
 		return ret;