Re: [PATCH] generic: add gc stress test

On 2024-04-17 16:50, Hans Holmberg wrote:
> On 2024-04-17 16:07, Zorro Lang wrote:
>> On Wed, Apr 17, 2024 at 01:21:39PM +0000, Hans Holmberg wrote:
>>> On 2024-04-17 14:43, Zorro Lang wrote:
>>>> On Tue, Apr 16, 2024 at 11:54:37AM -0700, Darrick J. Wong wrote:
>>>>> On Tue, Apr 16, 2024 at 09:07:43AM +0000, Hans Holmberg wrote:
>>>>>> +Zorro (doh!)
>>>>>>
>>>>>> On 2024-04-15 13:23, Hans Holmberg wrote:
>>>>>>> This test stresses garbage collection for file systems by first filling
>>>>>>> up a scratch mount to a specific usage point with files of random size,
>>>>>>> then doing overwrites in parallel with deletes to fragment the backing
>>>>>>> storage, forcing reclaim.
>>>>>>>
>>>>>>> Signed-off-by: Hans Holmberg <hans.holmberg@xxxxxxx>
>>>>>>> ---
>>>>>>>
>>>>>>> Test results in my setup (kernel 6.8.0-rc4+)
>>>>>>> 	f2fs on zoned nullblk: pass (77s)
>>>>>>> 	f2fs on conventional nvme ssd: pass (13s)
>>>>>>> 	btrfs on zoned nullblk: fails (-ENOSPC)
>>>>>>> 	btrfs on conventional nvme ssd: fails (-ENOSPC)
>>>>>>> 	xfs on conventional nvme ssd: pass (8s)
>>>>>>>
>>>>>>> Johannes(cc) is working on the btrfs ENOSPC issue.
>>>>>>> 	
>>>>>>>      tests/generic/744     | 124 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>>      tests/generic/744.out |   6 ++
>>>>>>>      2 files changed, 130 insertions(+)
>>>>>>>      create mode 100755 tests/generic/744
>>>>>>>      create mode 100644 tests/generic/744.out
>>>>>>>
>>>>>>> diff --git a/tests/generic/744 b/tests/generic/744
>>>>>>> new file mode 100755
>>>>>>> index 000000000000..2c7ab76bf8b1
>>>>>>> --- /dev/null
>>>>>>> +++ b/tests/generic/744
>>>>>>> @@ -0,0 +1,124 @@
>>>>>>> +#! /bin/bash
>>>>>>> +# SPDX-License-Identifier: GPL-2.0
>>>>>>> +# Copyright (c) 2024 Western Digital Corporation.  All Rights Reserved.
>>>>>>> +#
>>>>>>> +# FS QA Test No. 744
>>>>>>> +#
>>>>>>> +# Inspired by btrfs/273 and generic/015
>>>>>>> +#
>>>>>>> +# This test stresses garbage collection in file systems
>>>>>>> +# by first filling up a scratch mount to a specific usage point with
>>>>>>> +# files of random size, then doing overwrites in parallel with
>>>>>>> +# deletes to fragment the backing zones, forcing reclaim.
>>>>>>> +
>>>>>>> +. ./common/preamble
>>>>>>> +_begin_fstest auto
>>>>>>> +
>>>>>>> +# real QA test starts here
>>>>>>> +
>>>>>>> +_require_scratch
>>>>>>> +
>>>>>>> +# This test requires specific data space usage, skip if we have compression
>>>>>>> +# enabled.
>>>>>>> +_require_no_compress
>>>>>>> +
>>>>>>> +M=$((1024 * 1024))
>>>>>>> +min_fsz=$((1 * ${M}))
>>>>>>> +max_fsz=$((256 * ${M}))
>>>>>>> +bs=${M}
>>>>>>> +fill_percent=95
>>>>>>> +overwrite_percentage=20
>>>>>>> +seq=0
>>>>>>> +
>>>>>>> +_create_file() {
>>>>>>> +	local file_name=${SCRATCH_MNT}/data_$1
>>>>>>> +	local file_sz=$2
>>>>>>> +	local dd_extra=$3
>>>>>>> +
>>>>>>> +	POSIXLY_CORRECT=yes dd if=/dev/zero of=${file_name} \
>>>>>>> +		bs=${bs} count=$(( $file_sz / ${bs} )) \
>>>>>>> +		status=none $dd_extra  2>&1
>>>>>>> +
>>>>>>> +	status=$?
>>>>>>> +	if [ $status -ne 0 ]; then
>>>>>>> +		echo "Failed writing $file_name" >>$seqres.full
>>>>>>> +		exit
>>>>>>> +	fi
>>>>>>> +}
>>>>>
>>>>> I wonder, is there a particular reason for doing all these file
>>>>> operations with shell code instead of using fsstress to create and
>>>>> delete files to fill the fs and stress all the zone-gc code?  This test
>>>>> reminds me a lot of generic/476 but with more fork()ing.
>>>>
>>>> /me has the same confusion. Can this test cover more things than using
>>>> fsstress (to do reclaim test) ? Or does it uncover some known bugs which
>>>> other cases can't?
>>>
>>> ah, adding some more background is probably useful:
>>>
>>> I've been using this test to stress the crap out of the zoned xfs garbage
>>> collection / write throttling implementation for zoned rt subvolumes
>>> support in xfs and it has found a number of issues during implementation
>>> that I did not reproduce by other means.
>>>
>>> I think it also has wider applicability as it triggers bugs in btrfs.
>>> f2fs passes without issues, but probably benefits from a quick smoke gc
>>> test as well. Discussed this with Bart and Daeho (now in cc) before
>>> submitting.
>>>
>>> Using fsstress would be cool, but as far as I can tell it cannot
>>> be told to operate at a specific file system usage point, which
>>> is a key thing for this test.
>>
>> As a random test case, if this case can be transformed to use fsstress to cover
>> same issues, that would be nice.
>>
>> But if as a regression test case, it has its particular test coverage, and the
>> issue it covered can't be reproduced by fsstress way, then let's work on this
>> bash script one.
>>
>> Any thoughts?
> 
> Yeah, I think bash is preferable for this particular test case.
> Bash also makes it easy to hack for people's private uses.
> 
> I use longer versions of this test (increasing overwrite_percentage)
> for weekly testing.
> 
> If we need fsstress for reproducing any future gc bug we can add
> what's missing to it then.
> 
> Does that make sense?
> 

Hey Zorro,

Any remaining concerns about adding this test? I could run it across
more file systems (bcachefs could be interesting) and share the results
if need be.

Thanks,
Hans
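
For anyone following the thread, the "fill to a specific usage point with
files of random size" step can be sketched roughly as below. This is an
illustration only and not code from the patch: the helpers pick_file_size
and plan_fill are hypothetical names, and the capacity is passed in as a
plain byte count instead of being read from the scratch mount. Only the
variable names (M, min_fsz, max_fsz, fill_percent) follow the quoted script.

```shell
#!/bin/bash
# Illustrative sketch only; not taken from tests/generic/744.
M=$((1024 * 1024))
min_fsz=$((1 * M))
max_fsz=$((256 * M))
fill_percent=95

# Hypothetical helper: print a random size between min_fsz and max_fsz,
# always a whole number of MiB so it divides evenly by a 1 MiB block size.
pick_file_size() {
	local min_mb=$((min_fsz / M))
	local max_mb=$((max_fsz / M))
	# RANDOM yields 0..32767, which is plenty for a 1..256 MiB range.
	echo $(( (min_mb + RANDOM % (max_mb - min_mb + 1)) * M ))
}

# Hypothetical helper: print one file size per line until the sizes add up
# to fill_percent of the given capacity, never exceeding that target.
plan_fill() {
	local capacity_bytes=$1
	local target=$((capacity_bytes * fill_percent / 100))
	local used=0 sz

	while [ $((target - used)) -ge "$min_fsz" ]; do
		sz=$(pick_file_size)
		# Clamp the last file so the plan does not overshoot the target.
		if [ $((used + sz)) -gt "$target" ]; then
			sz=$(( (target - used) / M * M ))
		fi
		echo "$sz"
		used=$((used + sz))
	done
}
```

Calling plan_fill with a byte count, e.g. plan_fill $((1024 * M)), prints the
sizes to write. In a real test the capacity would have to come from the
mounted scratch file system (for instance via stat -f or df), and each size
would be handed to a dd-based writer such as the _create_file function in
the quoted patch.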



