On Tue, Jul 26, 2022 at 02:29:48PM +0800, Qu Wenruo wrote: > The new test case will verify that btrfs can handle one corrupted device > without affecting the consistency of the filesystem. > > Unlike a missing device, one corrupted device can return garbage to the fs, > thus btrfs has to utilize its data/metadata checksum to verify which > data is correct. > > The test case will: > > - Create a small fs > Mostly to speedup the test > > - Fill the fs with a regular file > > - Use fsstress to create some contents > > - Save the fssum for later verification > > - Corrupt one device with garbage but keep the primary superblock > untouched > > - Run fssum verification > > - Run scrub to fix the fs > > - Run scrub again to make sure the fs is fine > > Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> Works for me, and looks like a nice test to complement btrfs/027. Reviewed-by: Boris Burkov <boris@xxxxxx> > --- > tests/btrfs/261 | 94 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/261.out | 2 + > 2 files changed, 96 insertions(+) > create mode 100755 tests/btrfs/261 > create mode 100644 tests/btrfs/261.out > > diff --git a/tests/btrfs/261 b/tests/btrfs/261 > new file mode 100755 > index 00000000..15218e28 > --- /dev/null > +++ b/tests/btrfs/261 > @@ -0,0 +1,94 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 261 > +# > +# Make sure btrfs raid profiles can handling one corrupted device > +# without affecting the consistency of the fs. > +# > +. ./common/preamble > +_begin_fstest raid > + > +. ./common/filter > +. ./common/populate > + > +_supported_fs btrfs > +_require_scratch_dev_pool 4 > +_require_fssum > + > +prepare_fs() > +{ > + local profile=$1 > + > + # We don't want too large fs which can take too long to populate > + # And the extra redirection of stderr is to avoid the RAID56 warning > + # message to polluate the golden output > + _scratch_pool_mkfs -m $profile -d $profile -b 1G >> $seqres.full 2>&1 > + if [ $? -ne 0 ]; then > + echo "mkfs $mkfs_opts failed" > + return > + fi > + > + # Disable compression, as compressed read repair is known to have problems > + _scratch_mount -o compress=no > + > + # Fill some part of the fs first > + $XFS_IO_PROG -f -c "pwrite -S 0xfe 0 400M" $SCRATCH_MNT/garbage > /dev/null 2>&1 > + > + # Then use fsstress to generate some extra contents. > + # Disable setattr related operations, as it may set NODATACOW which will > + # not allow us to use btrfs checksum to verify the content. > + $FSSTRESS_PROG -f setattr=0 -d $SCRATCH_MNT -w -n 3000 > /dev/null 2>&1 > + sync > + > + # Save the fssum of this fs > + $FSSUM_PROG -A -f -w $tmp.saved_fssum $SCRATCH_MNT > + $BTRFS_UTIL_PROG fi show $SCRATCH_MNT >> $seqres.full > + _scratch_unmount > +} > + > +workload() > +{ > + local target=$(echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $1}') > + local profile=$1 > + local num_devs=$2 > + > + _scratch_dev_pool_get $num_devs > + echo "=== Testing profile $profile ===" >> $seqres.full > + rm -f -- $tmp.saved_fssum > + prepare_fs $profile > + > + # Corrupt the target device, only keep the superblock. > + $XFS_IO_PROG -c "pwrite 1M 1023M" $target > /dev/null 2>&1 > + > + _scratch_mount > + > + # All content should be fine > + $FSSUM_PROG -r $tmp.saved_fssum $SCRATCH_MNT > /dev/null > + > + # Scrub to fix the fs, this is known to report various correctable > + # errors > + $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >> $seqres.full 2>&1 > + > + # Make sure above scrub fixed the fs > + $BTRFS_UTIL_PROG scrub start -Br $SCRATCH_MNT >> $seqres.full > + if [ $? -ne 0 ]; then > + echo "scrub failed to fix the fs for profile $profile" > + fi > + _scratch_unmount > + _scratch_dev_pool_put > +} > + > +workload raid1 2 > +workload raid1c3 3 > +workload raid1c4 4 > +workload raid10 4 > +workload raid5 3 > +workload raid6 4 Speaking of 027, Can you implement this with _btrfs_profile_configs? Further, you could imagine writing a more generic test that does: for raid in raids: create-fs raid screw-up disk(s) check-condition and make 027 and this new one (and others?) special cases of that. I think this might fall under YAGNI.. Food for thought :) > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/261.out b/tests/btrfs/261.out > new file mode 100644 > index 00000000..679ddc0f > --- /dev/null > +++ b/tests/btrfs/261.out > @@ -0,0 +1,2 @@ > +QA output created by 261 > +Silence is golden > -- > 2.36.1 >