On 2024-05-08 10:51, Zorro Lang wrote:
> On Wed, May 08, 2024 at 07:08:01AM +0000, Hans Holmberg wrote:
>> On 2024-04-17 16:50, Hans Holmberg wrote:
>>> On 2024-04-17 16:07, Zorro Lang wrote:
>>>> On Wed, Apr 17, 2024 at 01:21:39PM +0000, Hans Holmberg wrote:
>>>>> On 2024-04-17 14:43, Zorro Lang wrote:
>>>>>> On Tue, Apr 16, 2024 at 11:54:37AM -0700, Darrick J. Wong wrote:
>>>>>>> On Tue, Apr 16, 2024 at 09:07:43AM +0000, Hans Holmberg wrote:
>>>>>>>> +Zorro (doh!)
>>>>>>>>
>>>>>>>> On 2024-04-15 13:23, Hans Holmberg wrote:
>>>>>>>>> This test stresses garbage collection for file systems by first filling
>>>>>>>>> up a scratch mount to a specific usage point with files of random size,
>>>>>>>>> then doing overwrites in parallel with deletes to fragment the backing
>>>>>>>>> storage, forcing reclaim.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Hans Holmberg <hans.holmberg@xxxxxxx>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> Test results in my setup (kernel 6.8.0-rc4+)
>>>>>>>>> f2fs on zoned nullblk: pass (77s)
>>>>>>>>> f2fs on conventional nvme ssd: pass (13s)
>>>>>>>>> btrfs on zoned nullblk: fails (-ENOSPC)
>>>>>>>>> btrfs on conventional nvme ssd: fails (-ENOSPC)
>>>>>>>>> xfs on conventional nvme ssd: pass (8s)
>>>>>>>>>
>>>>>>>>> Johannes(cc) is working on the btrfs ENOSPC issue.
>>>>>>>>>
>>>>>>>>>  tests/generic/744     | 124 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>  tests/generic/744.out |   6 ++
>>>>>>>>>  2 files changed, 130 insertions(+)
>>>>>>>>>  create mode 100755 tests/generic/744
>>>>>>>>>  create mode 100644 tests/generic/744.out
>>>>>>>>>
>>>>>>>>> diff --git a/tests/generic/744 b/tests/generic/744
>>>>>>>>> new file mode 100755
>>>>>>>>> index 000000000000..2c7ab76bf8b1
>>>>>>>>> --- /dev/null
>>>>>>>>> +++ b/tests/generic/744
>>>>>>>>> @@ -0,0 +1,124 @@
>>>>>>>>> +#! /bin/bash
>>>>>>>>> +# SPDX-License-Identifier: GPL-2.0
>>>>>>>>> +# Copyright (c) 2024 Western Digital Corporation. All Rights Reserved.
>>>>>>>>> +#
>>>>>>>>> +# FS QA Test No. 744
>>>>>>>>> +#
>>>>>>>>> +# Inspired by btrfs/273 and generic/015
>>>>>>>>> +#
>>>>>>>>> +# This test stresses garbage collection in file systems
>>>>>>>>> +# by first filling up a scratch mount to a specific usage point with
>>>>>>>>> +# files of random size, then doing overwrites in parallel with
>>>>>>>>> +# deletes to fragment the backing zones, forcing reclaim.
>>>>>>>>> +
>>>>>>>>> +. ./common/preamble
>>>>>>>>> +_begin_fstest auto
>>>>>>>>> +
>>>>>>>>> +# real QA test starts here
>>>>>>>>> +
>>>>>>>>> +_require_scratch
>>>>>>>>> +
>>>>>>>>> +# This test requires specific data space usage, skip if we have compression
>>>>>>>>> +# enabled.
>>>>>>>>> +_require_no_compress
>>>>>>>>> +
>>>>>>>>> +M=$((1024 * 1024))
>>>>>>>>> +min_fsz=$((1 * ${M}))
>>>>>>>>> +max_fsz=$((256 * ${M}))
>>>>>>>>> +bs=${M}
>>>>>>>>> +fill_percent=95
>>>>>>>>> +overwrite_percentage=20
>>>>>>>>> +seq=0
>>>>>>>>> +
>>>>>>>>> +_create_file() {
>>>>>>>>> +	local file_name=${SCRATCH_MNT}/data_$1
>>>>>>>>> +	local file_sz=$2
>>>>>>>>> +	local dd_extra=$3
>>>>>>>>> +
>>>>>>>>> +	POSIXLY_CORRECT=yes dd if=/dev/zero of=${file_name} \
>>>>>>>>> +		bs=${bs} count=$(( $file_sz / ${bs} )) \
>>>>>>>>> +		status=none $dd_extra 2>&1
>>>>>>>>> +
>>>>>>>>> +	status=$?
>>>>>>>>> +	if [ $status -ne 0 ]; then
>>>>>>>>> +		echo "Failed writing $file_name" >>$seqres.full
>>>>>>>>> +		exit
>>>>>>>>> +	fi
>>>>>>>>> +}
>>>>>>>
>>>>>>> I wonder, is there a particular reason for doing all these file
>>>>>>> operations with shell code instead of using fsstress to create and
>>>>>>> delete files to fill the fs and stress all the zone-gc code? This test
>>>>>>> reminds me a lot of generic/476 but with more fork()ing.
>>>>>>
>>>>>> /me has the same confusion. Can this test cover more things than using
>>>>>> fsstress (to do reclaim test)? Or does it uncover some known bugs which
>>>>>> other cases can't?
>>>>>
>>>>> ah, adding some more background is probably useful:
>>>>>
>>>>> I've been using this test to stress the crap out of the zoned xfs garbage
>>>>> collection / write throttling implementation for zoned rt subvolumes
>>>>> support in xfs and it has found a number of issues during implementation
>>>>> that I did not reproduce by other means.
>>>>>
>>>>> I think it also has wider applicability as it triggers bugs in btrfs.
>>>>> f2fs passes without issues, but probably benefits from a quick smoke gc
>>>>> test as well. Discussed this with Bart and Daeho (now in cc) before
>>>>> submitting.
>>>>>
>>>>> Using fsstress would be cool, but as far as I can tell it cannot
>>>>> be told to operate at a specific file system usage point, which
>>>>> is a key thing for this test.
>>>>
>>>> As a random test case, if this case can be transformed to use fsstress to cover
>>>> same issues, that would be nice.
>>>>
>>>> But if as a regression test case, it has its particular test coverage, and the
>>>> issue it covered can't be reproduced by fsstress way, then let's work on this
>>>> bash script one.
>>>>
>>>> Any thoughts?
>>>
>>> Yeah, I think bash is preferable for this particular test case.
>>> Bash also makes it easy to hack for people's private uses.
>>>
>>> I use longer versions of this test (increasing overwrite_percentage)
>>> for weekly testing.
>>>
>>> If we need fsstress for reproducing any future gc bug we can add
>>> what's missing to it then.
>>>
>>> Does that make sense?
>>>
>>
>> Hey Zorro,
>>
>> Any remaining concerns for adding this test? I could run it across
>> more file systems (bcachefs could be interesting) and share the results
>> if need be.
>
> Hi,
>
> I remembered you mentioned btrfs fails on this test, and I can reproduce it
> on btrfs [1] with general disk. Have you figured out the reason? I don't
> want to give btrfs a test failure suddenly without a proper explanation :)
> If it's a case issue, better to fix it for btrfs.

I was surprised to see the failure for btrfs on a conventional block
device, but have not dug into it. I suspect/assume it's the same root
cause as the issue Johannes is looking into when using a zoned block
device as backing storage.

I debugged that a bit with Johannes, and noticed that if I manually
kick btrfs rebalancing after each write via sysfs, the test progresses
further (but super slow).

So *I think* that btrfs needs to:

* tune the triggering of gc to kick in way before available free space
  runs out
* start slowing down / blocking writes when reclaim pressure is high to
  avoid premature -ENOSPC errors.

It's a pretty nasty problem, as potentially any write could fail with
-ENOSPC long before the reported available space runs out when a
workload ends up fragmenting the disk and write pressure is high.

Thanks,
Hans (back from a couple of days away from email)

>
> Thanks,
> Zorro
>
> # ./check generic/744
> FSTYP         -- btrfs
> PLATFORM      -- Linux/x86_64 hp-dl380pg8-01 6.9.0-0.rc5.20240425gite88c4cfcb7b8.47.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 25 14:21:52 UTC 2024
> MKFS_OPTIONS  -- /dev/sda4
> MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/scratch
>
> generic/744 115s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//generic/744.out.bad)
>     --- tests/generic/744.out 2024-05-08 16:11:14.476635417 +0800
>     +++ /root/git/xfstests/results//generic/744.out.bad 2024-05-08 16:46:03.617194377 +0800
>     @@ -2,5 +2,4 @@
>      Starting fillup using direct IO
>      Starting mixed write/delete test using direct IO
>      Starting mixed write/delete test using buffered IO
>     -Syncing
>     -Done, all good
>     +dd: error writing '/mnt/scratch/data_82': No space left on device
>     ...
>     (Run 'diff -u /root/git/xfstests/tests/generic/744.out /root/git/xfstests/results//generic/744.out.bad' to see the entire diff)
> Ran: generic/744
> Failures: generic/744
> Failed 1 of 1 tests
>
>>
>> Thanks,
>> Hans
>
>
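
To make the two btrfs suggestions in the reply above a bit more concrete,
here is a rough sketch of the kind of policy meant: start gc early, and
throttle writers once free space gets really low. This is a toy model only.
The function name reclaim_action and both default thresholds (25% and 10%)
are made up for illustration, and it does not reflect how btrfs or any
other file system actually implements reclaim.

```shell
#!/bin/bash
# Illustrative only: a toy policy, not btrfs code. The function name and
# both default thresholds are hypothetical.
#
# $1: free space, in percent of the file system size
# $2: free-space percentage at or below which gc should start (default 25)
# $3: free-space percentage at or below which writers are throttled
#     (default 10)
reclaim_action() {
	local free_pct=$1
	local gc_start_pct=${2:-25}
	local throttle_pct=${3:-10}

	if [ "$free_pct" -le "$throttle_pct" ]; then
		# Reclaim pressure is high: make writers wait for gc
		echo "throttle"
	elif [ "$free_pct" -le "$gc_start_pct" ]; then
		# Start reclaiming early, while writes can still proceed
		echo "gc"
	else
		echo "write"
	fi
}

# Walk through three free-space levels
for free in 40 20 5; do
	echo "free=${free}% -> $(reclaim_action $free)"
done
```

The point of the two thresholds is that gc gets a head start while there is
still room to relocate data, and writers only get slowed down when that head
start was not enough, instead of hitting -ENOSPC.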