On Wed, Oct 19, 2022 at 01:29:55PM +0800, Qu Wenruo wrote: > [BACKGROUND] > There is bug report from btrfs mailing list that, hiberation can allow "hibernation". > one to modify the frozen filesystem unexpectedly (using another OS). > (https://lore.kernel.org/linux-btrfs/83bf3b4b-7f4c-387a-b286-9251e3991e34@xxxxxxxxxxxx/) > > Later btrfs adds the check to make sure the fs is not changed > unexpectedly, to prevent corruption from happening. > > [TESTCASE] > Here the new test case will create a basic filesystem, fill it with > something by using fsstress, then sync the fs, and finally freeze the fs. > > Then corrupt the whole fs by overwriting the block device with 0xcd > (default seed from xfs_io pwrite command). > > Finally we thaw the fs, and try if we can create a new file. > > for EXT4, it will detect the corruption at touch time, causing -EUCLEAN. Heh, yikes. That's pretty scary for ext4 since it still uses buffer heads from the block device to read/store metadata and older kernels are known to have crashing problems if (say) the feature bits in the primary superblock get changed. I wonder if this should force errors=remount-ro for ext4 since errors=continue is dangerous and erorrs=panic will crash the test machine. > For Btrfs, it will detect the corruption at thaw time, marking the > fs RO immediately, and later touch will return -EROFS. What /does/ btrfs check, specifically? Reading this makes me wonder if xfs shouldn't re-read its primary super on thaw to check that nobody ran us over with a backhoe, though that wouldn't help us in the hibernation case. (Or does it? Is userspace/systemd finally smart enough to freeze filesystems?) > For XFS, it will detect the corruption at touch time, return -EUCLEAN. > (Without the cache drop, XFS seems to be very happy using the cache info > to do the work without any error though.) Yep. > Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> > --- > tests/generic/702 | 61 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/702.out | 2 ++ > 2 files changed, 63 insertions(+) > create mode 100755 tests/generic/702 > create mode 100644 tests/generic/702.out > > diff --git a/tests/generic/702 b/tests/generic/702 > new file mode 100755 > index 00000000..fc3624e1 > --- /dev/null > +++ b/tests/generic/702 > @@ -0,0 +1,61 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 702 > +# > +# Test if the filesystem can detect the underlying disk has changed at > +# thaw time. > +# > +. ./common/preamble > +. ./common/filter > +_begin_fstest freeze quick > + > +# real QA test starts here > + > +_supported_fs generic > +_fixed_by_kernel_commit a05d3c915314 \ > + "btrfs: check superblock to ensure the fs was not modified at thaw time" Hmmm, it's not very useful for a test failure on (say) xfs spitting out a message about how this "may" get fixed with a btrfs patch. How about: $FSTYP = btrfs && _fixed_by_kernel_commit a05d3c915314 \ "btrfs: check superbloc..." > + > +# We will corrupt the device completely, thus should not check it after the test. > +_require_scratch_nocheck > +_require_freeze > + > +# Limit the fs to 512M so we won't waste too much time screwing it up later. > +_scratch_mkfs_sized $((512 * 1024 * 1024)) >> $seqres.full 2>&1 > +_scratch_mount > + > +# Populate the fs with something. > +$FSSTRESS_PROG -n 500 -d $SCRATCH_MNT >> $seqres.full > + > +# Sync to make sure no dirty journal > +sync > + > +# Drop all cache, so later write will need to read from disk, increasing > +# the chance of detecting the corruption. > +echo 3 > /proc/sys/vm/drop_caches > + > +$XFS_IO_PROG -x -c "freeze" $SCRATCH_MNT > + > +# Now screw up the block device > +$XFS_IO_PROG -f -c "pwrite 0 512M" -c sync $SCRATCH_DEV >> $seqres.full directio and a larger buffer size to speed this up? e.g. $XFS_IO_PROG -d -c 'pwrite -b 1m 0 512M' -c sync $SCRATCH_DEV > + > +# Thaw the fs, it may or may not report error, we will check it manually later. > +$XFS_IO_PROG -x -c "thaw" $SCRATCH_MNT I'm a little surprised you don't check for btrfs returning an error here...? > +# If the fs detects something wrong, it should trigger error now. > +# We don't use the error message as golden output, as btrfs and ext4 use > +# different error number for different reasons. > +# (btrfs detects the change immediately at thaw time and mark the fs RO, thus > +# touch returns -EROFS, while ext4 detects the change at journal write time, > +# returning -EUCLEAN). > +touch $SCRATCH_MNT/foobar >>$seqres.full 2>&1 > +if [ $? -eq 0 ]; then > + echo "Failed to detect corrupted fs" > +else > + echo "Detected corrupted fs (expected)" > +fi But otherwise this test looks reasonable so far. --D > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/702.out b/tests/generic/702.out > new file mode 100644 > index 00000000..c29311ff > --- /dev/null > +++ b/tests/generic/702.out > @@ -0,0 +1,2 @@ > +QA output created by 702 > +Detected corrupted fs (expected) > -- > 2.38.0 >