Hi all,

Currently, phase 4 of xfs_scrub uses per-AG repair item lists to schedule
repair work across a thread pool. This scheme is suboptimal when most of
the repairs involve a single AG because all the work gets dumped on a
single pool thread.

Instead, we should create a thread pool with the same number of workers as
CPUs, and dispatch individual repair tickets as separate work items to
maximize parallelization.

However, we also need to ensure that repairs to space metadata and file
metadata are kept in separate queues because file repairs generally depend
on correctness of space metadata.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything. Enjoy!
Comments and questions are, as always, welcome.

--D

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-repair-scheduling
---
 include/list.h        |   14 +++
 libfrog/ptvar.c       |    9 ++
 libfrog/ptvar.h       |    4 +
 scrub/counter.c       |    2
 scrub/descr.c         |    2
 scrub/phase1.c        |   15 ++-
 scrub/phase2.c        |   23 ++++-
 scrub/phase3.c        |  106 ++++++++++++++--------
 scrub/phase4.c        |  240 ++++++++++++++++++++++++++++++++++++-------------
 scrub/phase7.c        |    2
 scrub/read_verify.c   |    2
 scrub/repair.c        |  172 +++++++++++++++++++++++------------
 scrub/repair.h        |   37 ++++++--
 scrub/scrub.c         |    5 +
 scrub/scrub.h         |   10 ++
 scrub/scrub_private.h |    2
 scrub/xfs_scrub.h     |    3 -
 17 files changed, 465 insertions(+), 183 deletions(-)
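
For readers who want a concrete picture of the scheduling idea in the cover
letter, here is a rough sketch. It is not code from this series. The names
repair_ticket, repair_queue, drain_queue and run_repairs are hypothetical,
the "repair" is a stand-in that only sets a flag, and fully draining the
space metadata queue before starting the file metadata queue is just one
simple way to honor the dependency; the actual patches may order the work
differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_WORKERS	64	/* arbitrary cap chosen for this sketch */

/* Hypothetical repair ticket; not the structure used by xfs_scrub. */
struct repair_ticket {
	int			id;
	int			done;	/* set once the simulated repair ran */
};

/* One queue per metadata class; workers claim tickets one at a time. */
struct repair_queue {
	pthread_mutex_t		lock;
	struct repair_ticket	*tickets;
	size_t			nr;
	size_t			next;	/* index of next unclaimed ticket */
};

/* Worker: keep claiming individual tickets until the queue is empty. */
static void *repair_worker(void *arg)
{
	struct repair_queue	*q = arg;
	struct repair_ticket	*t;

	for (;;) {
		pthread_mutex_lock(&q->lock);
		t = (q->next < q->nr) ? &q->tickets[q->next++] : NULL;
		pthread_mutex_unlock(&q->lock);
		if (!t)
			return NULL;
		t->done = 1;	/* stand-in for the real repair call */
	}
}

/* Drain one queue with up to one worker per online CPU. */
static size_t drain_queue(struct repair_ticket *tickets, size_t nr)
{
	struct repair_queue	q = { .tickets = tickets, .nr = nr };
	pthread_t		threads[MAX_WORKERS];
	long			ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	size_t			nr_workers, started = 0, completed = 0, i;

	assert(tickets != NULL || nr == 0);

	nr_workers = (ncpus < 1) ? 1 : (size_t)ncpus;
	if (nr_workers > MAX_WORKERS)
		nr_workers = MAX_WORKERS;

	pthread_mutex_init(&q.lock, NULL);
	for (i = 0; i < nr_workers; i++) {
		if (pthread_create(&threads[started], NULL,
				   repair_worker, &q) == 0)
			started++;
	}
	/* If no thread could be started, do the work in the caller. */
	if (started == 0)
		repair_worker(&q);
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	pthread_mutex_destroy(&q.lock);

	/* Count the tickets that were actually processed. */
	for (i = 0; i < nr; i++)
		completed += tickets[i].done;
	return completed;
}

/* Space metadata is repaired first because file repairs depend on it. */
static size_t run_repairs(struct repair_ticket *space, size_t nr_space,
			  struct repair_ticket *file, size_t nr_file)
{
	size_t	fixed = drain_queue(space, nr_space);

	return fixed + drain_queue(file, nr_file);
}
```

Because each worker claims one ticket at a time from a shared queue, a pile
of repairs that all target the same AG still spreads across every worker
instead of landing on a single pool thread. A caller would fill two ticket
arrays, one per metadata class, and pass both to run_repairs().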