On 2019/1/26 上午2:57, David Sterba wrote: > On Mon, Jan 14, 2019 at 04:20:43PM +0800, Qu Wenruo wrote: >> Kernel commit 64403612b73a ("btrfs: rework >> btrfs_check_space_for_delayed_refs") is introducing a regression for >> btrfs balance performance. >> >> Originally, the workload used in the test case only takes seconds for >> balance on v4.20 while now it takes over 400 seconds for balance. >> >> During that 400 seconds balance, it commits over 2000 transactions just >> for nothing, compare to original several transactions. >> >> Add test cases to detect such regression. > > How do you detect that the test regressed, other than comparing the > runtime recorded between runs of the testsuite? I don't actually know > how this could be done inside the script, as there can't be a fixed > number of seconds to compare with. The run time depends on the > underlying storage. I have a better idea for detecting this, by using super generation as failure criteria. So then we could have a simple FAIL/PASS check. Although, we still need to take a long time to run if we hit that regression. Thanks, Qu > > I takes 14 seconds on a physical machine on a rotational disk and 9 > seconds on a SSD. > > Other than the, I've used this test to validate the fix, for that > > Tested-by: David Sterba <dsterba@xxxxxxxx> > > It would be good to have the test in the testsuite so possibly some > large timeout can be set and the test will print a message, this should > catch attention at least and let us decide. > >> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> >> --- >> Extra note: >> In fact, without the snapshots created in the test case, it would return >> -ENOSPC even we have enough unallocated space. >> --- >> tests/btrfs/180 | 80 +++++++++++++++++++++++++++++++++++++++++++++ >> tests/btrfs/180.out | 2 ++ >> tests/btrfs/group | 1 + >> 3 files changed, 83 insertions(+) >> create mode 100755 tests/btrfs/180 >> create mode 100644 tests/btrfs/180.out >> >> diff --git a/tests/btrfs/180 b/tests/btrfs/180 >> new file mode 100755 >> index 00000000..534fea01 >> --- /dev/null >> +++ b/tests/btrfs/180 >> @@ -0,0 +1,80 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved. >> +# >> +# FS QA Test 180 >> +# >> +# Test if metadata balance would take forever to finish or return ENOSPC even >> +# there there are tons of space. >> +# >> +# This is regression caused by upstream commit 64403612b73a >> +# ("btrfs: rework btrfs_check_space_for_delayed_refs") >> +# >> +seq=`basename $0` >> +seqres=$RESULT_DIR/$seq >> +echo "QA output created by $seq" >> + >> +here=`pwd` >> +tmp=/tmp/$$ >> +status=1 # failure is the default! >> +trap "_cleanup; exit \$status" 0 1 2 3 15 >> + >> +_cleanup() >> +{ >> + cd / >> + rm -f $tmp.* >> +} >> + >> +# get standard environment, filters and checks >> +. ./common/rc >> +. ./common/filter >> + >> +# remove previous $seqres.full before test >> +rm -f $seqres.full >> + >> +# real QA test starts here >> + >> +# Modify as appropriate. >> +_supported_fs btrfs >> +_supported_os Linux >> +_require_scratch >> +_require_xfs_io_command falloc >> + >> +i=0 >> +loop=16384 >> + >> +_scratch_mkfs > /dev/null 2>&1 >> +_scratch_mount >> + >> +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/src" > /dev/null >> + >> +while [ $i -le $loop ]; do >> + # Use small file to create inline extent >> + _pwrite_byte 0x00 0 2K "$SCRATCH_MNT/src/inline_$i" > /dev/null >> + #$XFS_IO_PROG -f -c "falloc 0 4K" "$SCRATCH_MNT/src/regular_$i" > /dev/null > > You can delete the commented command > >> + i=$((i + 1)) >> +done >> + >> +# Create enough snapshots so at space reservation part of relocation, we could >> +# generate enough space pressure >> +for i in $(seq -w 0 16); do >> + $BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/src" \ >> + "$SCRATCH_MNT/snapshot_$i" > /dev/null >> + # touch random files to create some new tree blocks >> + for j in $(seq -w 0 16); do >> + victim="$(ls $SCRATCH_MNT/snapshot_$i | sort -R | head -n1)" >> + touch "$SCRATCH_MNT/snapshot_$i/$victim" >> + done >> +done >> + >> +# Balancing metadata shouldn't be too time consuming, as the amount of metadata >> +# is less than 8M, thus normally it should finish very quick. >> +# >> +# However with that offending commit, it will take forever to finish or return >> +# ENOSPC after a long wait. >> +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" > /dev/null >> + >> +echo "Silence is golden" >> +# success, all done >> +status=0 >> +exit >> diff --git a/tests/btrfs/180.out b/tests/btrfs/180.out >> new file mode 100644 >> index 00000000..50aba766 >> --- /dev/null >> +++ b/tests/btrfs/180.out >> @@ -0,0 +1,2 @@ >> +QA output created by 180 >> +Silence is golden >> diff --git a/tests/btrfs/group b/tests/btrfs/group >> index 46dd3c95..e724968b 100644 >> --- a/tests/btrfs/group >> +++ b/tests/btrfs/group >> @@ -182,3 +182,4 @@ >> 177 auto quick swap balance >> 178 auto quick send >> 179 auto qgroup dangerous >> +180 auto balance dangerous > > I don't think it falls to the dangerous category, it does not crash the > machine, only that the test could take long. >
Attachment:
signature.asc
Description: OpenPGP digital signature