On 2024-04-17 16:50, Hans Holmberg wrote:
> On 2024-04-17 16:07, Zorro Lang wrote:
>> On Wed, Apr 17, 2024 at 01:21:39PM +0000, Hans Holmberg wrote:
>>> On 2024-04-17 14:43, Zorro Lang wrote:
>>>> On Tue, Apr 16, 2024 at 11:54:37AM -0700, Darrick J. Wong wrote:
>>>>> On Tue, Apr 16, 2024 at 09:07:43AM +0000, Hans Holmberg wrote:
>>>>>> +Zorro (doh!)
>>>>>>
>>>>>> On 2024-04-15 13:23, Hans Holmberg wrote:
>>>>>>> This test stresses garbage collection for file systems by first filling
>>>>>>> up a scratch mount to a specific usage point with files of random size,
>>>>>>> then doing overwrites in parallel with deletes to fragment the backing
>>>>>>> storage, forcing reclaim.
>>>>>>>
>>>>>>> Signed-off-by: Hans Holmberg <hans.holmberg@xxxxxxx>
>>>>>>> ---
>>>>>>>
>>>>>>> Test results in my setup (kernel 6.8.0-rc4+)
>>>>>>> f2fs on zoned nullblk: pass (77s)
>>>>>>> f2fs on conventional nvme ssd: pass (13s)
>>>>>>> btrfs on zoned nublk: fails (-ENOSPC)
>>>>>>> btrfs on conventional nvme ssd: fails (-ENOSPC)
>>>>>>> xfs on conventional nvme ssd: pass (8s)
>>>>>>>
>>>>>>> Johannes(cc) is working on the btrfs ENOSPC issue.
>>>>>>>
>>>>>>> tests/generic/744     | 124 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>> tests/generic/744.out |   6 ++
>>>>>>> 2 files changed, 130 insertions(+)
>>>>>>> create mode 100755 tests/generic/744
>>>>>>> create mode 100644 tests/generic/744.out
>>>>>>>
>>>>>>> diff --git a/tests/generic/744 b/tests/generic/744
>>>>>>> new file mode 100755
>>>>>>> index 000000000000..2c7ab76bf8b1
>>>>>>> --- /dev/null
>>>>>>> +++ b/tests/generic/744
>>>>>>> @@ -0,0 +1,124 @@
>>>>>>> +#! /bin/bash
>>>>>>> +# SPDX-License-Identifier: GPL-2.0
>>>>>>> +# Copyright (c) 2024 Western Digital Corporation. All Rights Reserved.
>>>>>>> +#
>>>>>>> +# FS QA Test No. 744
>>>>>>> +#
>>>>>>> +# Inspired by btrfs/273 and generic/015
>>>>>>> +#
>>>>>>> +# This test stresses garbage collection in file systems
>>>>>>> +# by first filling up a scratch mount to a specific usage point with
>>>>>>> +# files of random size, then doing overwrites in parallel with
>>>>>>> +# deletes to fragment the backing zones, forcing reclaim.
>>>>>>> +
>>>>>>> +. ./common/preamble
>>>>>>> +_begin_fstest auto
>>>>>>> +
>>>>>>> +# real QA test starts here
>>>>>>> +
>>>>>>> +_require_scratch
>>>>>>> +
>>>>>>> +# This test requires specific data space usage, skip if we have compression
>>>>>>> +# enabled.
>>>>>>> +_require_no_compress
>>>>>>> +
>>>>>>> +M=$((1024 * 1024))
>>>>>>> +min_fsz=$((1 * ${M}))
>>>>>>> +max_fsz=$((256 * ${M}))
>>>>>>> +bs=${M}
>>>>>>> +fill_percent=95
>>>>>>> +overwrite_percentage=20
>>>>>>> +seq=0
>>>>>>> +
>>>>>>> +_create_file() {
>>>>>>> +	local file_name=${SCRATCH_MNT}/data_$1
>>>>>>> +	local file_sz=$2
>>>>>>> +	local dd_extra=$3
>>>>>>> +
>>>>>>> +	POSIXLY_CORRECT=yes dd if=/dev/zero of=${file_name} \
>>>>>>> +		bs=${bs} count=$(( $file_sz / ${bs} )) \
>>>>>>> +		status=none $dd_extra 2>&1
>>>>>>> +
>>>>>>> +	status=$?
>>>>>>> +	if [ $status -ne 0 ]; then
>>>>>>> +		echo "Failed writing $file_name" >>$seqres.full
>>>>>>> +		exit
>>>>>>> +	fi
>>>>>>> +}
>>>>>
>>>>> I wonder, is there a particular reason for doing all these file
>>>>> operations with shell code instead of using fsstress to create and
>>>>> delete files to fill the fs and stress all the zone-gc code? This test
>>>>> reminds me a lot of generic/476 but with more fork()ing.
>>>>
>>>> /me has the same confusion. Can this test cover more things than using
>>>> fsstress (to do reclaim test) ? Or does it uncover some known bugs which
>>>> other cases can't?
>>>
>>> ah, adding some more background is probably useful:
>>>
>>> I've been using this test to stress the crap out the zoned xfs garbage
>>> collection / write throttling implementation for zoned rt subvolumes
>>> support in xfs and it has found a number of issues during implementation
>>> that i did not reproduce by other means.
>>>
>>> I think it also has wider applicability as it triggers bugs in btrfs.
>>> f2fs passes without issues, but probably benefits from a quick smoke gc
>>> test as well. Discussed this with Bart and Daeho (now in cc) before
>>> submitting.
>>>
>>> Using fsstress would be cool, but as far as I can tell it cannot
>>> be told to operate at a specific file system usage point, which
>>> is a key thing for this test.
>>
>> As a random test case, if this case can be transformed to use fsstress to cover
>> same issues, that would be nice.
>>
>> But if as a regression test case, it has its particular test coverage, and the
>> issue it covered can't be reproduced by fsstress way, then let's work on this
>> bash script one.
>>
>> Any thoughts?
>
> Yeah, I think bash is preferable for this particular test case.
> Bash also makes it easy to hack for people's private uses.
>
> I use longer versions of this test (increasing overwrite_percentage)
> for weekly testing.
>
> If we need fsstress for reproducing any future gc bug we can add
> whats missing to it then.
>
> Does that make sense?
>

Hey Zorro,

Any remaining concerns about adding this test? I could run it across
more file systems (bcachefs could be interesting) and share the results
if need be.

Thanks,
Hans
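
P.S. To make the "specific file system usage point" argument a bit more
concrete, below is a rough sketch of the idea. It is not the code from the
patch: bytes_to_target and plan_fill are hypothetical helper names, and the
way file sizes are picked is an assumption made for illustration. Only the
parameter names and values (M, min_fsz, max_fsz, bs, and a 95 percent fill
target) mirror the quoted test.

```shell
#!/bin/bash
# Illustrative sketch only: hypothetical helpers, not taken from the patch.
# Parameter names and values mirror the ones in the quoted test.

M=$((1024 * 1024))
min_fsz=$((1 * M))
max_fsz=$((256 * M))
bs=$M

# Print how many bytes must still be written so that used space reaches
# target_pct percent of total_bytes (0 if already at or above the target).
bytes_to_target() {
	local total_bytes=$1 used_bytes=$2 target_pct=$3
	local target_bytes=$(( total_bytes * target_pct / 100 ))

	if [ "$used_bytes" -ge "$target_bytes" ]; then
		echo 0
	else
		echo $(( target_bytes - used_bytes ))
	fi
}

# Print one file size per line, each a random multiple of bs between
# min_fsz and max_fsz, with the last one trimmed so the sizes add up to
# exactly the number of bytes needed to hit the target.
# Arguments: total_bytes used_bytes target_pct
plan_fill() {
	local remaining blocks sz

	remaining=$(bytes_to_target "$1" "$2" "$3")
	blocks=$(( (max_fsz - min_fsz) / bs + 1 ))
	while [ "$remaining" -gt 0 ]; do
		# RANDOM yields 15 bits, so combine two draws for a wider range.
		sz=$(( min_fsz + ((RANDOM << 15 | RANDOM) % blocks) * bs ))
		if [ "$sz" -gt "$remaining" ]; then
			sz=$remaining
		fi
		echo "$sz"
		remaining=$(( remaining - sz ))
	done
}
```

In an actual run, the total and used figures would come from the scratch
mount (for example via GNU df with -B1 --output=size,used), and each planned
size would be handed to something like the quoted _create_file. The point is
only that the stopping condition is a usage percentage rather than an
operation count.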