From: Dave Chinner <dchinner@xxxxxxxxxx>

Add a group named "unreliable_in_parallel" to mark tests that do not
give reliable results when multiple tests are run in parallel.
Generally this happens with tests that rely on caching in some way,
such as generating specific file layouts using buffered IO or
expecting inodes to be cached in memory. These are perturbed by other
tests running sync(), generating memory pressure, dropping caches,
etc.

Whether these tests pass or fail is therefore wholly dependent on
what tests are running at the same time, so they randomly fail when
nothing has actually gone wrong.

Hence they are unreliable as regression tests when running tests in
parallel, so add them to the "unreliable_in_parallel" group so that a
parallel check run can exclude this group.

As tests are updated to be robust against external interference, they
can be removed from the unreliable_in_parallel group.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 doc/group-names.txt | 1 +
 tests/generic/336   | 7 ++++++-
 tests/generic/561   | 8 +++++++-
 tests/xfs/177       | 8 ++++++--
 tests/xfs/232       | 6 +++++-
 tests/xfs/237       | 8 +++++++-
 tests/xfs/243       | 7 +++++--
 tests/xfs/300       | 8 ++++++--
 tests/xfs/440       | 6 +++++-
 tests/xfs/527       | 5 ++++-
 tests/xfs/631       | 7 ++++++-
 tests/xfs/802       | 7 ++++++-
 12 files changed, 64 insertions(+), 14 deletions(-)

diff --git a/doc/group-names.txt b/doc/group-names.txt
index ed886caac..f5bf79a56 100644
--- a/doc/group-names.txt
+++ b/doc/group-names.txt
@@ -138,6 +138,7 @@ trim			FITRIM ioctl
 udf			UDF functionality tests
 union			tests from the unionmount test suite
 unlink			O_TMPFILE unlinked files
+unreliable_in_parallel	randomly fail when run in parallel with other tests
 unshare			fallocate FALLOC_FL_UNSHARE_RANGE
 v2log			XFS v2 log format tests
 verity			fsverity
diff --git a/tests/generic/336 b/tests/generic/336
index 06391a93f..c874997e4 100755
--- a/tests/generic/336
+++ b/tests/generic/336
@@ -9,8 +9,13 @@
 # file F2 from directory B into directory C, fsync inode F1, power fail and
 # remount the filesystem, file F2 exists and is located only in directory C.
 #
+
+# unreliable_in_parallel: external sync operations can change what is synced to
+# the log before the flakey device drops writes. Hence post-remount file
+# contents can be different to what the test expects.
+
 . ./common/preamble
-_begin_fstest auto quick metadata log
+_begin_fstest auto quick metadata log unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/generic/561 b/tests/generic/561
index 3e931b1a7..602c235bc 100755
--- a/tests/generic/561
+++ b/tests/generic/561
@@ -7,8 +7,14 @@
 # Dedup & random I/O race test, do multi-threads fsstress and dedupe on
 # same directory/files
 #
+
+# unreliable_in_parallel: duperemove is buggy. It can get stuck in endless
+# fiemap mapping loops, and this seems to happen a *lot* when the system is
+# under heavy load. When they do this, they don't die when they are supposed to
+# and so have to be manually killed to end the test.
+
 . ./common/preamble
-_begin_fstest auto stress dedupe
+_begin_fstest auto stress dedupe unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/xfs/177 b/tests/xfs/177
index 773049524..22719ba1c 100755
--- a/tests/xfs/177
+++ b/tests/xfs/177
@@ -21,9 +21,13 @@
 # Regrettably, there is no way to poke /only/ XFS inode reclamation directly,
 # so we're stuck with setting xfssyncd_centisecs to a low value and sleeping
 # while watching the internal inode cache counters.
-#
+
+# unreliable_in_parallel: cache residency is affected by external drop caches
+# operations. Hence counting inodes "in cache" often does not reflect what the
+# test has actually done.
+
 . ./common/preamble
-_begin_fstest auto ioctl
+_begin_fstest auto ioctl unreliable_in_parallel
 
 _cleanup()
 {
diff --git a/tests/xfs/232 b/tests/xfs/232
index 0eea2c098..f0f3916e7 100755
--- a/tests/xfs/232
+++ b/tests/xfs/232
@@ -12,8 +12,12 @@
 # - Wait for the reclaim to run.
 # - Write more and see how bad fragmentation is.
 #
+
+# unreliable_in_parallel: external sync operations affect what happens while
+# the test is waiting for COW expiration.
+
 . ./common/preamble
-_begin_fstest auto quick clone fiemap prealloc
+_begin_fstest auto quick clone fiemap prealloc unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/xfs/237 b/tests/xfs/237
index f172aaf59..91f56d6c1 100755
--- a/tests/xfs/237
+++ b/tests/xfs/237
@@ -6,8 +6,14 @@
 #
 # Test AIO DIO CoW behavior when the write temporarily fails.
 #
+
+# unreliable_in_parallel: external drop caches can coincide with the error
+# table being loaded, so the test being run fails with EIO trying to load the
+# inode from disk instead of whatever operation it is supposed to fail on when
+# the inode is already cached in memory.
+
 . ./common/preamble
-_begin_fstest auto quick clone eio
+_begin_fstest auto quick clone eio unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/xfs/243 b/tests/xfs/243
index 964e94e1d..f9cc2d50f 100755
--- a/tests/xfs/243
+++ b/tests/xfs/243
@@ -15,9 +15,12 @@
 # 5. delalloc
 # - CoW across the halfway mark, starting with the unwritten extent.
 # - Check that the files are now different where we say they're different.
-#
+
+# unreliable_in_parallel: external sync can affect the layout of the files being
+# created, resulting in unreliable detection of delalloc extents.
+
 . ./common/preamble
-_begin_fstest auto quick clone punch prealloc
+_begin_fstest auto quick clone punch prealloc unreliable_in_parallel
 
 # Import common functions.
 . ./common/filter
diff --git a/tests/xfs/300 b/tests/xfs/300
index 3f0dbb9ac..c4c3b1ab8 100755
--- a/tests/xfs/300
+++ b/tests/xfs/300
@@ -5,9 +5,13 @@
 # FS QA Test No. 300
 #
 # Test xfs_fsr / exchangerange management of di_forkoff w/ selinux
-#
+
+# unreliable_in_parallel: file layout appears to be perturbed by load related
+# timing issues. Not 100% sure, but the backwards write does not reliably
+# fragment the source file under heavy external load.
+
 . ./common/preamble
-_begin_fstest auto fsr
+_begin_fstest auto fsr unreliable_in_parallel
 
 # Import common functions.
 . ./common/filter
diff --git a/tests/xfs/440 b/tests/xfs/440
index 0cc679aeb..c0b6756ba 100755
--- a/tests/xfs/440
+++ b/tests/xfs/440
@@ -8,8 +8,12 @@
 # a file that has CoW reservations and no dirty pages. The reservations
 # should shift over to the new owner, but they do not.
 #
+
+# unreliable_in_parallel: external sync(1) and/or drop caches can reclaim inodes
+# and free post-eof space, resulting in lower than expected block counts.
+
 . ./common/preamble
-_begin_fstest auto quick clone quota
+_begin_fstest auto quick clone quota unreliable_in_parallel
 
 # Import common functions.
 . ./common/reflink
diff --git a/tests/xfs/527 b/tests/xfs/527
index 2ef428c25..0d06b128c 100755
--- a/tests/xfs/527
+++ b/tests/xfs/527
@@ -14,8 +14,11 @@
 # xfs: fix incorrect root dquot corruption error when switching group/project
 # quota types
 
+# unreliable_in_parallel: dmesg check can pick up corruptions from other tests.
+# Need to filter corruption reports by short scratch dev name.
+
 . ./common/preamble
-_begin_fstest auto quick quota
+_begin_fstest auto quick quota unreliable_in_parallel
 
 # Import common functions.
 . ./common/quota
diff --git a/tests/xfs/631 b/tests/xfs/631
index 4d79b821f..319995f81 100755
--- a/tests/xfs/631
+++ b/tests/xfs/631
@@ -7,8 +7,13 @@
 # Post-EOF preallocation defeat test for direct I/O with extent size hints.
 #
 
+# unreliable_in_parallel: external cache drops can result in the extent size
+# being truncated as the inode is evicted from cache between writes. This can
+# increase the number of extents significantly beyond what would be expected
+# from the extent size hint.
+
 . ./common/preamble
-_begin_fstest auto quick prealloc rw
+_begin_fstest auto quick prealloc rw unreliable_in_parallel
 
 . ./common/filter
 
diff --git a/tests/xfs/802 b/tests/xfs/802
index ea09817fd..fc4767acb 100755
--- a/tests/xfs/802
+++ b/tests/xfs/802
@@ -8,8 +8,13 @@
 # filesystem, and that we can read the health reports after the fact. IOWs,
 # this is basic testing for the systemd background services.
 #
+
+# unreliable_in_parallel: this appears to try to run scrub services on all
+# mounted filesystems - that's a problem when there are a hundred other test
+# filesystems mounted running other tests...
+
 . ./common/preamble
-_begin_fstest auto scrub
+_begin_fstest auto scrub unreliable_in_parallel
 
 _cleanup()
 {
-- 
2.45.2
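
For reviewers who want to see which tests end up in the new group, the sketch below scans a directory for `_begin_fstest` lines that name a given group. It is only an illustration: the `tests_in_group` helper and the throwaway demo files are hypothetical and are not part of this patch or of fstests.

```shell
#!/bin/bash
# Sketch only: list the test files in a directory whose _begin_fstest line
# names a given group. tests_in_group and the demo files are hypothetical.
tests_in_group()
{
	local dir="$1" group="$2"

	# Match the group as a whole word anywhere after _begin_fstest.
	grep -lE "^_begin_fstest( .*)? ${group}( |\$)" "$dir"/* 2>/dev/null | sort
}

# Demo against two throwaway files that mimic fstests test headers.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

printf '%s\n' '. ./common/preamble' \
	'_begin_fstest auto scrub unreliable_in_parallel' > "$demo/802"
printf '%s\n' '. ./common/preamble' \
	'_begin_fstest auto quick' > "$demo/001"

tests_in_group "$demo" unreliable_in_parallel
```

In a real fstests tree the group would normally be excluded through the check script's existing group options, for example `./check -g auto -x unreliable_in_parallel`.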